Hello everyone and welcome to the Agentic AI course for beginners, your starting point into the new era of intelligent, autonomous AI systems. Agentic AI is transforming how machines work. Instead of just responding to prompts, AI agents can now reason, plan, take actions, and work across tools to complete tasks on their own. From smart assistants to automated research agents and workflow bots, this is the next big shift in artificial intelligence. In this course, you will learn the core building blocks behind agentic AI, from understanding deep learning, LLMs, and transformers to exploring LangChain, RAG, LLMOps, and the modern agent frameworks. Whether you're completely new to AI or already exploring generative models, this course makes the concepts simple, practical, and beginner friendly. By the end, you will understand how agentic systems work, how to build basic agents, and how to start your journey in this fast-growing field. So before we begin, please like, share, and subscribe to Edureka's YouTube channel and hit the bell icon to stay updated on the latest content from Edureka. Also check out Edureka's agentic AI certification training. It is carefully crafted to meet industry demands and prepare you for the future of intelligent agents. You will gain practical skills in LangChain, RAG, LLMOps, and more through live instructor-led sessions and hands-on labs. Whether you're a beginner or a tech professional, this course helps you master the concepts and accelerate your AI career. So check out the course link given in the description box below. Now let us get started by understanding what agentic AI is.
Agentic AI is transforming industries by allowing machines to learn, adapt, and evolve independently, similar to living organisms.
Unlike traditional AI, these intelligent
agents investigate, optimize, and
develop solutions over time without
requiring direct human participation.
Recent advancements include OpenAI's
deep research, which automatically
analyzes massive amounts of data to
provide detailed reports, and Google's
Gemini 2.0, which improves AI's capacity to plan and reason across different data types. ServiceNow's AI agent orchestrator is transforming enterprise automation by coordinating many AI agents to address complex business problems. As these systems
become more powerful, they have the
potential to unlock ideas beyond the
human imagination, ranging from wind turbine blade design to AI-driven company
management. Let's start with our first
topic. What is agentic AI? Agentic AI
denotes artificial intelligence systems
capable of autonomously executing
actions to attain designated objectives. Unlike reactive AI, which only responds to inputs, agentic AI is proactive, capable of planning, adapting, and making decisions autonomously. So let's dive deeper into agentic AI and see its
capabilities. Agentic AI is a type of
artificial intelligence that exhibits
autonomous behavior, enabling it to take
actions and operate without continuous
human guidance. It is goal-driven,
actively working towards achieving
specific objectives rather than
passively responding to inputs like
reactive AI. And with advanced decision-making capabilities, it can evaluate
multiple options, select the optimal
course of action based on current
conditions and acquired knowledge and
adapt its strategies dynamically in
response to unforeseen changes in its
environment. Moreover, agentic AI
demonstrates proactiveness by taking the
initiative to act rather than waiting
for external triggers making it highly
effective in dynamic and complex
scenarios. Now let us see its relevance
in the current AI market. When AI
systems can act autonomously to
accomplish predefined objectives, we
call that agentic AI, making it highly
relevant in the current AI market. Its
autonomy allows it to operate without
continuous human guidance, making
decisions and adapting dynamically to
achieve objectives. This capability is
complemented by its advanced problem
solving skills, enabling it to evaluate
complex situations, strategize and
respond effectively to challenges.
However, the growing adoption of agentic AI also raises important ethical
considerations such as ensuring
responsible behavior, minimizing
unintended consequences and maintaining
transparency in its decision-making processes. Now that you know about agentic AI, let us discuss how it differs from other AI systems. Agentic AI
differs significantly from other AI
systems in its autonomy, decision making
and adaptability to achieve long-term
goals. Unlike reactive AI, which performs predefined tasks only when prompted, such as spam filters or image classifiers, agentic AI takes the initiative and operates independently. It also contrasts with generative AI, which focuses on creating content, like ChatGPT generating text, but is not goal-driven. By combining autonomous behavior, strategic decision making, and the ability to adapt dynamically, agentic AI stands out as a powerful
system designed to achieve specific
objectives in evolving environments. Now
since we know a bit of differences, let
us see the comparison between generative
AI and agentic AI. Generative AI and
agentic AI differ in several key aspects
that define their functionality and
applications. Generative AI is primarily
focused on creation, excelling in output-focused tasks such as generating text, images, or other forms of content. Its
adaptability is limited as it relies
heavily on prompts for guidance and
lacks the ability to operate
independently. In contrast, agentic AI
emphasizes autonomy, making it
goal-driven and capable of dynamically
adapting to changing environments.
Unlike the prompt dependent nature of
generative AI, agentic AI is
self-directed, enabling it to take the
initiative and execute strategic tasks effectively. These differences highlight the complementary roles of both AI types
in addressing distinct challenges. Now
let us see the impact of agentic AI on
various industries. Agentic AI has had a
profound impact across various
industries transforming operations and
solving long-standing challenges.
Autonomous logistics systems such as
those in Amazon warehouses have
significantly improved operational
efficiency by 30 to 40%. In healthcare, AI-enabled surgical robots like the da Vinci system have performed over 10 million minimally invasive procedures worldwide, enhancing precision and
patient outcomes. Scientific
advancements have also been transformed
by systems like DeepMind's AlphaFold, which successfully solved the decades-old protein folding problem. On a global
scale, the World Economic Forum predicts
that by 2025, AI will displace 85
million jobs while creating 97 million
new ones, reshaping the labor market.
And in the energy sector, AI-powered smart grids can reduce electricity waste by up to 10%, promoting greener energy solutions. Additionally, over 90 countries are investing in AI-enabled
military technology to modernize their
defense systems, showcasing the
strategic importance of agentic AI in
global security. Now, let us see the
applications of agentic AI. Agentic AI
is transforming various industries by
enabling systems to make autonomous
decisions, adapt to changing
environments, and achieve specific
goals. In autonomous vehicles, agentic AI powers self-driving cars and drones to navigate roads, avoid obstacles, and make real-time decisions, as seen with Tesla Autopilot and autonomous delivery drones. In robotics, agentic AI allows industrial, healthcare, and exploration robots to perform complex tasks independently, as demonstrated by Boston Dynamics robots used in logistics and rescue operations. Personalized virtual
assistants like Google Assistant and
Amazon Alexa leverage agentic AI to
predict user needs, manage schedules,
and execute tasks without direct commands. And in gaming, adaptive AI agents enhance the experience by creating challenging, human-like opponents, such as AlphaGo and AI bots in real-time strategy games. In healthcare,
agentic AI supports personalized
treatments, accurate diagnostics, and
surgical assistance, with examples including AI-driven surgical robots and
systems for remote patient monitoring.
These applications demonstrate the
transformative potential of agentic AI
across diverse domains. Agentic AI is
making a significant impact across
various industries by enabling autonomy,
adaptability, and efficiency in diverse
applications. In finance, it powers
algorithmic trading systems and fraud
detection tools, optimizing financial
operations such as managing investment
portfolios and identifying fraudulent
activities. In smart cities, AI systems manage energy consumption, optimize
traffic flow and enhance public safety
with examples like smart traffic lights
adapting in real time and autonomous
energy grid optimization. In space
exploration, autonomous spacecraft and
planetary rovers, such as NASA's Mars rovers, perform exploration tasks independently. In education, AI-powered
tutors like Carnegie Learning provide
personalized instruction by adapting to
individual learning styles. In military
and defense, autonomous drones and surveillance systems improve situational awareness and decision making, as with AI-driven surveillance drones in defense
applications. Now let us see the challenges and risks associated with agentic AI. While agentic AI offers tremendous potential, it also faces several challenges and risks that must be
addressed to ensure its safety and
ethical deployment. So one key concern is misalignment with human goals, where AI systems may pursue objectives that conflict with human intentions due to poorly defined parameters or unintended consequences, such as an autonomous robot prioritizing efficiency over safety. Ethical questions arise regarding accountability and decision-making,
demonstrated by the challenge of
determining who is responsible when an
autonomous vehicle causes an accident.
The complexity of decision-making in agentic AI can also lead to a lack of transparency, making it difficult to understand or explain its actions, particularly in sensitive fields like healthcare or finance. Ensuring safety
and reliability is another challenge as
AI systems must operate effectively in
unpredictable environments such as
autonomous drones encountering extreme weather or mechanical failures.
Additionally, agentic AI systems often
require substantial computational
resources making their deployment costly
as seen in advanced robotics and
self-driving cars. Security vulnerabilities pose further risks, as
autonomous systems could be targeted by
cyber attacks potentially leading to
harmful consequences like the
manipulation of autonomous vehicles.
Lastly, overdependence on AI may reduce
human oversight or lead to skill
degradation in critical areas such as
relying too heavily on autonomous
systems for medical diagnosis without
human validation. These challenges
highlight the need for robust design,
rigorous testing and ethical frameworks
to mitigate risks and maximize the
benefits of agentic AI. Now let's see
the future of agentic AI. The future of
agentic AI is set to be transformative
with advancements across various domains
influencing its deployment. Future
systems will exhibit increased autonomy
and adaptability, enabling them to make complex decisions in real time and
operate effectively in dynamic
environments without human intervention.
The integration of agentic AI with
advanced technologies like quantum computing, IoT, and edge computing will further enhance its capabilities, allowing for faster decision making and real-time processing at the edge. These
systems will have widespread applications in sectors such as healthcare, where they will enable autonomous medical diagnostics, personalized treatment plans, and robotic surgery; climate action, with advanced systems for environmental monitoring and response; and space exploration, where smart rovers and spacecraft will carry out missions on their own. As these
technologies evolve, ethical concerns
and accountability will need to be
addressed, promoting the development of regulatory frameworks to ensure responsible AI usage. Additionally,
agentic AI will foster human-AI collaboration, enhancing productivity and creativity in fields such as education, engineering, and research.
Imagine asking ChatGPT for a poem and it
writes one instantly. Now think about an
AI assistant planning your entire day,
booking meetings, and even handling
emails without your constant input.
That's the difference between generative AI, which creates content, and agentic AI, which acts with autonomy, making its own decisions. In 2025, as AI becomes more
than just a tool, understanding the
shift is very critical. Are we heading
towards just smarter chatbots or truly
independent digital agents? Let's break
it down through this video. To truly understand the shift, let's first break
down what generative AI is. Generative
AI is a type of artificial intelligence
designed to create content, whether it's
text, images, music, or even code.
Instead of making decisions or even
taking action on its own, it focuses on
producing outputs based on the patterns
it has learned from vast amounts of data. At its core, generative AI models
use deep learning techniques like
transformers to generate new content
that resembles human created work. For
example, ChatGPT generates human-like text based on prompts, Midjourney and DALL-E create stunning images from simple text descriptions, and GitHub Copilot helps developers by suggesting code snippets in real time. Generative AI has several
strengths. It enhances creativity and productivity, allowing artists, writers, and programmers to work faster and more efficiently. It scales effortlessly, generating unlimited variations of content in just a few seconds. It also
adapts responses based on user input
making interactions feel more
personalized. But it also comes with a few limitations. Generative AI lacks
autonomy. It doesn't think or act on its
own. It only responds when prompted. It
has no real decision-making abilities and cannot evaluate consequences or make independent choices. Additionally,
it can generate biased or inaccurate
content based on the data that it has
seen. While generative AI is powerful
for creating, it cannot act
independently. And that's where agentic
AI comes in. Let's explore what agentic
AI is. Agentic AI goes beyond just
generating content. It acts autonomously, making decisions and executing tasks without the need for constant human input. Unlike generative AI, which only responds to prompts, agentic AI can plan, adapt, and take initiative based on goals rather than specific instructions. At its core, agentic AI
combines reasoning, memory, and decision
making to operate more like an
independent agent. It doesn't just create; it analyzes, strategizes, and acts. Real-world examples include autonomous robots, which navigate and complete tasks on their own; AI-driven personal assistants, like those managing schedules, booking flights, and handling emails without human oversight; and even
self-driving cars which continuously
assess their environment and make
split-second driving decisions. Agentic
AI has its own strengths. It reduces the need for manual intervention by automating complex workflows. It adapts to real
world conditions, learning and improving
over time. It can even handle multi-step
tasks that require planning, execution,
and adjustment. But it also has its own
challenges. Developing truly autonomous
AI requires significant advancements in
reasoning and adaptability. There are
certain risks including unintended
behaviors and ethical concerns around
AI, which makes independent decisions.
And unlike generative AI which focuses
on creativity, agentic AI is limited in
how well it can generate novel content.
So while generative AI creates and agentic AI acts, the real power comes when the two work together. Let's see
the key differences between generative
AI and agentic AI. Generative AI and
agentic AI serve different purposes,
each with unique strengths and
applications. The key distinction comes
down to creativity versus decision
making. As previously discussed,
generative AI focuses on producing
content, whether it's text, image, or
code. It enhances creativity by
assisting writers, designers, and
developers. But it lacks true autonomy.
It only works when prompted and doesn't make any decisions on its own. Agentic AI, on the other hand, is designed for interaction and execution. Instead of
just generating responses, it can
analyze situations, make decisions, and
take actions. While it may not create
content like generative AI, it can
manage workflows, automate tasks, and adapt to real-world conditions. Another
key difference is user dependency.
Generative AI is entirely reactive,
meaning it requires human input to
function. It waits for prompts before
generating anything. In contrast,
agentic AI is proactive. It can initiate
actions independently, setting
reminders, optimizing schedules, or even
solving problems without human
intervention. The applications of these
AI types also differ. Generative AI is widely used in content creation, marketing, entertainment, and software development, while agentic AI powers autonomous systems like self-driving cars, AI-powered customer service, and personal assistants that can handle complex workflows. Both AI types are
transforming industries, but when they work together, they unlock even greater potential. Imagine an AI that not only generates a marketing campaign but also launches it, tracks engagement, and refines the strategy automatically.
The future isn't just about choosing
between generative AI and agentic AI.
It's about combining the two to build truly intelligent systems. Now that we
understand the key differences between
these two, let's explore the future of
AI by asking, will generative AI be
replaced? As AI continues to evolve, one
big question arises. Will agentic AI
replace generative AI? Right now,
generative AI is everywhere, helping
people write, design, and code faster
than ever before. But it has one major
limitation. It relies entirely on human
input. Agentic AI, on the other hand, takes things further. It doesn't just generate; it decides, plans, and even acts. It's the next step towards truly autonomous intelligence. Does that mean generative AI will become obsolete?
Not necessarily. The future of AI isn't
about one replacing the other. It's
about coexisting. Generative AI will
keep getting more creative and sophisticated, producing even higher-quality content. Agentic AI will become
even more autonomous, integrating deeper
with industries like healthcare,
finance, and robotics.
But this shift does come with some risks. As AI takes on decision-making power, we face new challenges: ethical concerns, unintended consequences, and the need for accountability. If an AI
the need for accountability. If an AI
agent makes a bad decision, who is
responsible? And how do we ensure it aligns with human values? The answer lies in balance. The real future of AI is a hybrid approach, where generative AI
fuels creativity and agentic AI drives
intelligent action. Imagine an AI system
that not only writes a research paper
but also submits it to journals, responds to reviews, and refines it automatically. And this is where we are
headed. Not just smarter AI, but AI that
truly works with us as both a creator
and an agent. The question isn't whether
agentic AI will replace generative AI.
It's how we'll harness both to shape the
future of intelligence. Now that we have
explored the differences between
generative AI and agentic AI, let's move
on to building an intelligent AI agent
that can interact with our database
using natural language. This means you
can simply ask a question like "show me all the students who have scored above 80," and the agent will automatically convert it into an SQL query, fetch the data, and return the exact result from the database. No need to write complex SQL queries manually; just ask, and the AI responds. Let's dive in and build
this powerful system. First, we need to set up a conda environment to manage our project dependencies. To do this, we open the terminal and run the following command: conda create -p venv python==3.10 -y. Here, conda create creates a new environment, -p venv specifies the environment path as venv, python==3.10 installs Python version 3.10 inside the environment, and -y automatically confirms the installation without asking for approval. Once the process is
complete, our virtual environment is
ready and we can move forward with
setting up our agentic AI project. Next, we'll create a file named requirements.txt, where we'll list all the necessary libraries for our project. This will help us easily install dependencies in one go. Additionally, we'll create a .env file to securely store our Google Generative AI API key, keeping sensitive information separate from our main code.
With these files in place, we ensure a
well structured and organized setup for
our agentic AI project.
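If you want to confirm that the key from the .env file is actually being picked up, a quick sanity check like the sketch below works. Note that the variable name GOOGLE_API_KEY is just my assumed example; use whatever name you put in your own .env file.

```python
# Quick sanity check (sketch) that the .env key is visible to Python.
# GOOGLE_API_KEY is an assumed variable name, not necessarily the one used in the video.
import os
from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from the .env file into environment variables
print("API key loaded:", os.getenv("GOOGLE_API_KEY") is not None)
```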
First, we will work with SQLite, a lightweight, self-contained database engine, to create and manage a student database. Let's break it down step by step. So, we'll create a file named sql.py and import the sqlite3 module, which allows us to work with SQLite databases. We'll write import sqlite3.
This module provides all the necessary
functions to create a database, insert
records, retrieve data, and manage
connections. Next, we create a connection to an SQLite database file named student.db. We'll write connection = sqlite3.connect("student.db"). If this file doesn't exist, SQLite will automatically create it. The connection object will allow us to interact with the database. Now we
create a cursor object which is used to
execute SQL commands in Python. We'll write cursor = connection.cursor().
Think of the cursor as a tool that helps
us send queries to the database and
retrieve results. Now we define an SQL command to create a table named STUDENT with four columns. We'll write table_info = followed by a triple-quoted string, and inside it we'll write CREATE TABLE STUDENT(NAME VARCHAR(25), CLASS VARCHAR(25), SECTION VARCHAR(25), MARKS INT). Then we'll write cursor.execute(table_info). The NAME column stores the student's name as a string of up to 25 characters, CLASS stores the class name, SECTION stores the student's section, and MARKS stores the marks obtained as an integer. Executing this command creates the table in the database. Next, we insert
five student records into the student
table using SQL INSERT statements. I've already created and inserted five records into the table; you can insert as many as you like. Each INSERT command adds a new row with the student's name, class, section, and marks. Now, we retrieve and display all records from the student table. For that, we'll write print("The inserted records are:"). On the next line, we'll write data = cursor.execute("SELECT * FROM STUDENT"), and then for row in data: print(row). The SELECT * FROM STUDENT query fetches all the data from the table, and the for loop iterates through the records and prints them one by one. And finally we
commit our changes and close the database connection. For that, we'll write connection.commit() and then connection.close(). The commit() call ensures all the changes are saved in the database, and close() closes the connection, freeing up system resources. And that's it.
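Putting the dictated steps together, sql.py roughly looks like the sketch below. The five sample rows are placeholder values of my own, since the exact records aren't shown on screen.

```python
# sql.py -- a rough sketch assembled from the steps above.
import sqlite3

# Connect to (or create) the student.db database file
connection = sqlite3.connect("student.db")
cursor = connection.cursor()

# Create the STUDENT table with four columns
table_info = """
CREATE TABLE STUDENT (NAME VARCHAR(25), CLASS VARCHAR(25),
                      SECTION VARCHAR(25), MARKS INT);
"""
cursor.execute(table_info)

# Insert five student records (hypothetical sample values)
cursor.execute("INSERT INTO STUDENT VALUES ('Asha', 'Data Science', 'A', 90)")
cursor.execute("INSERT INTO STUDENT VALUES ('Ravi', 'Data Science', 'B', 75)")
cursor.execute("INSERT INTO STUDENT VALUES ('Meera', 'DevOps', 'A', 82)")
cursor.execute("INSERT INTO STUDENT VALUES ('John', 'DevOps', 'B', 61)")
cursor.execute("INSERT INTO STUDENT VALUES ('Sara', 'Cloud', 'A', 53)")

# Display all inserted records
print("The inserted records are:")
data = cursor.execute("SELECT * FROM STUDENT")
for row in data:
    print(row)

# Save the changes and close the connection
connection.commit()
connection.close()
```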
We have successfully created a student database, inserted records, and retrieved them using SQLite and Python. Now let's
build an interactive Streamlit app that converts natural language questions into SQL queries using Google's Gemini model. It then retrieves data from the SQLite database and displays the result. Let's
break it down step by step. But before we start, we have to activate the environment. For that, we'll run conda activate venv/ in the terminal. And with that, our environment is activated.
First, we'll create a file named app.py and load environment variables using dotenv. For that, we'll write from dotenv import load_dotenv, and then call load_dotenv(), which loads all the environment variables. This ensures that sensitive information such as API keys is securely
necessary modules. For that we'll write
import stream lit as st. Then import OS.
Then import escalite 3 and then import
Google.generative AI as genai.
Streamlight here powers the web
interface. OS helps access the
environment variables. SQLite 3 allows
us to interact with the database and
Google generative AI enables the
conversion of natural language into SQL
queries. Now we configure the Google
Gemini API key. But before that we'll
have to create a API key through Google
studio itself. I've already generated
one. You can create yours through Google
studio itself.
Then we'll write genai.configure
in the bracket API_key
equals to os do.get env
key. This allows the app to use Gemini
1.5 Pro to generate SQL queries. Then we
define a function to generate SQL queries from natural language input using Gemini. For that, we'll write def get_gemini_response(question, prompt). Next, we'll write model = genai.GenerativeModel("models/gemini-1.5-pro"). Then we'll write response = model.generate_content([prompt[0], question]), and then return response.text. The function initializes
the Gemini model. It takes a question
and predefined prompt as input and the
AI model generates an SQL query as
output. Next, we define a function to execute SQL queries on the database and retrieve results. For that, we'll write def read_sql_query(sql, db). Next, we'll write con = sqlite3.connect(db), then cur = con.cursor(), and then cur.execute(sql). Then we'll write rows = cur.fetchall(), then con.commit() and con.close(). Then we'll create a loop with for row in rows to print each row, and then return rows. The function
connects to the student.db database, executes the given SQL query, fetches all the retrieved records, and prints them. Now we define the AI prompt
that instructs Gemini on how to convert
the questions into SQL queries. As you
can see, I've already created a prompt of my own, and you can create yours according to how you want your model to function. If you want the prompt I've used here, you can just comment on the video and I'll send it to you. This prompt ensures that Gemini
generates SQL queries accurately without
unnecessary text. Now we'll set the page configuration with a title and icon. For that, we'll write st.set_page_config(page_title="SQL Query Generator - Edureka", page_icon=...). Then we'll display the Edureka logo and header. For that, we'll write st.image("123.png", width=200), followed by st.markdown() with a heading like "Edureka's Gemini App: Your AI-powered SQL Assistant".
Next, we'll write another st.markdown() with the text "Ask any question and I'll generate the SQL query for you." The page title and icon are set, a logo is displayed at the top, and the app's purpose is introduced to the user. And
before we import the logo, just make
sure that you have the logo in your
folder. We take user input for a natural
language query.
For that, we'll write question = st.text_input("Enter your query in plain English:", key="input"). This allows users to type their questions, such as "show all students with marks above 80." A submit button triggers the
SQL generation process, and for that we'll write submit = st.button("Generate SQL Query"). When clicked, the app processes the query and retrieves the result. Now
we define what happens when the submit
button is clicked.
For that, we'll write if submit:, and on the next line response = get_gemini_response(question, prompt); this converts the question to SQL, and then we print the response. Then we'll write response = read_sql_query(response, "student.db"); this executes the SQL on the database. Then we'll write st.subheader("The response is:"). Next, we'll include a loop, for row in response:, and inside it we'll print the row and write st.header(row).
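Assembled from all of the dictated steps, app.py roughly looks like the sketch below. The prompt text and the environment-variable name GOOGLE_API_KEY are placeholders of mine (the instructor's actual prompt isn't shown), and the logo file 123.png is assumed to sit in the project folder.

```python
# app.py -- a rough sketch assembled from the steps above, not the instructor's exact file.
from dotenv import load_dotenv
load_dotenv()  # load variables (including the Gemini API key) from the .env file

import os
import sqlite3
import streamlit as st
import google.generativeai as genai

# Configure the Gemini client; GOOGLE_API_KEY is an assumed variable name
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))

def get_gemini_response(question, prompt):
    """Ask Gemini 1.5 Pro to turn a natural language question into an SQL query."""
    model = genai.GenerativeModel("models/gemini-1.5-pro")
    response = model.generate_content([prompt[0], question])
    return response.text

def read_sql_query(sql, db):
    """Run the generated SQL against the SQLite database and return all rows."""
    con = sqlite3.connect(db)
    cur = con.cursor()
    cur.execute(sql)
    rows = cur.fetchall()
    con.commit()
    con.close()
    for row in rows:
        print(row)
    return rows

# Placeholder prompt -- the instructor uses their own, longer prompt
prompt = ["""
You are an expert in converting English questions into SQL queries.
The SQLite database has a table STUDENT with columns NAME, CLASS, SECTION, MARKS.
Return only the SQL query, with no extra text and no code fences.
"""]

# Streamlit user interface
st.set_page_config(page_title="SQL Query Generator - Edureka", page_icon=":robot_face:")  # icon is a placeholder
st.image("123.png", width=200)  # assumes the logo image is in the project folder
st.markdown("## Edureka's Gemini App: Your AI-powered SQL assistant")
st.markdown("Ask any question and I'll generate the SQL query for you.")

question = st.text_input("Enter your query in plain English:", key="input")
submit = st.button("Generate SQL Query")

if submit:
    response = get_gemini_response(question, prompt)     # convert the question to SQL
    print(response)
    response = read_sql_query(response, "student.db")    # execute the SQL on the database
    st.subheader("The response is:")
    for row in response:
        print(row)
        st.header(row)
```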
The user's question is converted into an SQL query using Gemini AI, the SQL query is executed on the student.db database, and the retrieved records are displayed in the Streamlit app. And that's it: the AI-powered Streamlit app allows users to ask natural language questions, which are automatically converted into SQL queries and executed on the student database. Now let's open the terminal
database. Now let's open the terminal
and run our streamllet app. To do this,
we simply type streamllet run app. py
and hit enter. It's running. And as you
can see, our agentic AI is up and
running, ready to interact with our
database. Let's test it by asking a
simple question.
We'll ask, give me the names of all the
students. The AI processes our request,
converts it into an SQL query, and
retrieves the student names from the
database. Perfect. As you can see, the
response is generated. Now, let's try
another query. We'll say, give me the
average of marks.
And just like that, the AI calculates
and returns the average marks. The response provided is 72.2. So
agentic AI that can understand natural
language, generate SQL queries and
interact with our data seamlessly.
Think about this: instead of you doing all your work, you have a machine that finishes it for you, or it can do something you thought was not possible. For instance, predicting the future, like predicting earthquakes and tsunamis so that preventive measures can be taken to save lives; chatbots; virtual personal assistants like Siri on iPhones and Google Assistant, which, believe me, are getting smarter day by day with deep learning; and self-driving cars, which will be a blessing for elderly and disabled people who find it difficult to drive on their own. And on top of that, they can also avoid a lot of accidents that happen due to human error. Next, the Google AI eye doctor. So
this is a recent initiative by Google
where Google is working with an Indian
eye care chain to develop an AI software
which can examine retina scans to
identify a condition called diabetic
retinopathy which can cause blindness.
Then there's the AI music composer. Who thought that we could have an AI music composer using deep
even machines will start winning
Grammys. And one of my favorites, a
dream-reading machine. With so many seemingly unrealistic applications of AI and deep learning that we have seen so far, I was wondering whether we could capture dreams in the form of a video or something. And I wasn't surprised to find out that this was tried in Japan a few years back on three test subjects, and they were able to achieve close to 60% accuracy, which is amazing. But I'm not sure whether people would want to be a test subject for this or not, because it can reveal all your dreams. Great. So this sets the base for
you and we are ready to understand what
is artificial intelligence.
Artificial intelligence is nothing but
the capability of a machine to imitate
intelligent human behavior. AI is
achieved by mimicking a human brain by
understanding how it thinks, how it learns, and how it works while trying to solve a problem. For example, a machine playing chess, or a voice-activated software which helps you with various things on your phone, or a number plate recognition system which captures the number plate of an overspeeding car and processes it to extract the registration number and identify the owner of the car so that he can be charged. And none of this was very easy to implement before deep learning. Now let's understand the
various subsets of artificial
intelligence. So till now you'd have
heard a lot about artificial
intelligence, machine learning and deep
learning. However, do you know the
relationship between all three of them?
So deep learning is a sub field of a sub
field of artificial intelligence. So it
is a sub field of machine learning which
is a sub field of artificial
intelligence. So when we look at
something like AlphaGo, it is often
portrayed as a big success for deep
learning, but it's actually a
combination of ideas from several
different areas of AI and machine
learning like deep learning,
reinforcement learning, self-play, etc.
And the idea behind deep neural networks is not new; it dates back to the 1950s. However, it became possible to practically implement it only when we had high-end computational resources available. So I hope that you have
understood what is artificial
intelligence. So let's explore machine
learning followed by its limitations.
So machine learning is a subset of
artificial intelligence which provides computers with the ability to learn
without being explicitly programmed. In
machine learning, we do not have to
define all the steps or conditions like
any other programming application.
However, we have to train the machine on
a training data set large enough to
create a model which helps the machine
to take decisions based on its learning.
For example, if we have to determine the species of a flower using a machine, then first we need to train the machine using a flower data set which contains various characteristics of different flowers along with the respective species. As you can see here in the image, we have got the sepal length, sepal width, petal length, petal width, and the species of
the flower too. So using this input data
set, the machine will create a model
which can be used to classify a flower.
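As a concrete, hedged illustration of this train-then-predict workflow, here is a short scikit-learn sketch using the classic iris flower dataset, which has exactly these sepal and petal measurements; the video only shows a generic flower table, so treat this as an illustrative stand-in.

```python
# A brief sketch of the train-then-predict workflow using the iris flower dataset.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, random_state=42)

# Train a model on the labelled flower measurements
model = DecisionTreeClassifier().fit(X_train, y_train)

# Pass a new set of characteristics (sepal length/width, petal length/width in cm)
# and the model outputs the species of the flower
sample = [[5.1, 3.5, 1.4, 0.2]]
print("Predicted species:", iris.target_names[model.predict(sample)[0]])
print("Test accuracy:", model.score(X_test, y_test))
```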
Next, we'll pass on a set of
characteristics as input to the model
and it will output the name of the
flower. And this process of training a
machine to create a model and use it for
decision making is called machine
learning. However, this process had some
limitations. Machine learning is not capable of handling high-dimensional data, that is, where the input and output are large and present in multiple dimensions; handling and processing such data becomes very complex and resource exhaustive, and this is termed the curse of dimensionality.
So to understand this in simpler terms,
let us consider a line of 100 yards. And
let us assume that you dropped the coin
somewhere in the line. You'll easily
find the coin by simply walking on the
line. A line is a single dimension
entity. Now let's consider that you have got a square of side 100 yards each and you dropped a coin somewhere inside the
square. Now definitely you'll take more
time to find the coin within that
square. A square is a two-dimensional
entity. Now let's take it a step ahead
and consider a cube of side 100 yards
each and you dropped a coin somewhere
inside the cube. Now it is even more
difficult to find the coin. So we see that the complexity increases as the dimensions increase, and in real life the high-dimensional data we're talking about has got many dimensions, which makes it very complex to handle and process. High-dimensional
data can be easily found in use cases
like image processing, natural language processing, image translation, etc. And machine learning was not capable of solving these use cases, and hence deep learning came to the rescue. So deep learning is capable of handling high-dimensional data and is also efficient in focusing on the right features on its own, and this process is called feature extraction. Now let's try
and understand how deep learning works.
So in an attempt to re-engineer a human
brain, deep learning studies the basic
unit of the brain, called a brain cell or a neuron, and inspired by the neuron, an artificial neuron or perceptron was developed. So if we focus on the
structure of a biological neuron, it has
got dendrites and these are used to
receive inputs and these inputs are
summed up inside the cell body and using
the axon it is passed on to the next
biological neuron. So similarly a
perceptron receives multiple inputs
applies various transformations and
functions and provides an output. As we
know that our brain consists of multiple
connected neurons called neural network.
We can also have a network of artificial
neurons called perceptrons to form a
deep neural network. Let's understand
how a deep neural network looks like. So
any deep neural network will consist of
three types of layers. the input layer,
the hidden layer and the output layer.
So if you see in the diagram, the first
layer is the input layer which receives
all the inputs. The last layer is the
output layer which gives the desired
output. And all the layers in between
these layers are called hidden layers.
And there can be n number of hidden
layers thanks to the high-end resources
available these days. And the number of
hidden layers and the number of
perceptrons in each layer will be
entirely dependent on the use case that
you're trying to solve.
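To make the perceptron idea concrete, here is a tiny NumPy sketch of a single perceptron; the input values and weights are made up purely for illustration.

```python
# A single perceptron: weighted sum of inputs plus a bias, passed through an activation.
import numpy as np

def perceptron(inputs, weights, bias):
    z = np.dot(inputs, weights) + bias     # summation step (like the cell body)
    return 1.0 / (1.0 + np.exp(-z))        # sigmoid activation producing the output

# Made-up example values, for illustration only
x = np.array([0.5, 0.2, 0.9])              # three inputs (signals arriving at the dendrites)
w = np.array([0.4, -0.6, 0.3])             # one weight per input
print(perceptron(x, w, bias=0.1))          # output passed on to the next layer of perceptrons
```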
And there are ways to decide the number of hidden layers; however, we'll not get into that in this session. Now,
since you have a picture of deep neural
network, let's try to get a high-level view of how a deep neural network solves a
problem. For example, we want to perform
image recognition using deep networks.
So, we'll have to pass this
highdimensional data to the input layer.
And to match the dimensionality of the
input data, the input layer will contain
multiple sub layers of perceptron so
that it can consume the entire input.
And the output received from the input layer will contain patterns and will only be able to identify the edges in the images based on the contrast levels. And
this output will be fed to hidden layer
1 where it will be able to identify
various face features like eyes, nose,
ears, etc. Now this will be fed to
hidden layer 2 where it will be able to
form the entire faces and sent to the
output layer to be classified and given
a name. Now think: if any of these layers is missing or the neural network is not deep enough, then what will happen? Simple, we'll not be able to accurately identify the images, and this is the very reason why these use cases did not have a solution all these years prior to deep learning. So just to take this further,
we'll try to apply a deep network on the MNIST data set. So the MNIST data set
consists of 60,000 training samples and
10,000 testing samples of handwritten
digit images. And the task here is to
train a model which can accurately
identify the digit present on the image.
And to solve this use case, a deep
network will be created with multiple
hidden layers to process all the 60,000
images pixel by pixel and finally will
receive an output. So the output will be
an array of index 0 to 9 where each
index corresponds to the respective
digit. So index 0 contains the
probability of 0 being the digit present
on the input image. Similarly, index 2
which has a value of 0.1 actually
represents the probability of two being
the digit present on the input image. So if you see, the highest probability in this array is 0.8, which is present at index seven of the array. Hence the number present on the image will be seven. So this is how the handwritten
image processing happens. Let me
practically execute this use case for
you. So this is my PyCharm IDE. First of
all, let me show you the data set. So
this is my MNIST data set and it has got four GZ files which get extracted when my program gets executed. Now the
program or the deep neural network using
which I was able to create a model to
process all these images and train my
machine is this create_model_2.py, and it's quite a lengthy program. So
I'll not be explaining the entire
program for you. But let me tell you the
technology or the framework with which I
was able to implement this. So I've been
using TensorFlow which is one of the
open-source Google libraries for deep
learning. And right here I have imported
TensorFlow and then I'm using this MNIST data and finally going ahead and
creating a deep neural network. So these
all things are here. It is creating a
deep neural network and the hidden layer
that is required to process all these
images. And finally, I'm creating a
model and I'm saving this model with this name right here, model_2.ckpt.
Now, if I run this code, it is going to
take a very long time. So, give it some
time.
It has extracted all the files and it
has started its training. It is at step
zero now. So in order to completely
train this model, it is going to take
20,000 steps. Let me show you in the
program as well. So here it is. So in
here I've set the steps to 20,000. But
you can always configure it to a smaller number, say 1,000 or 2,000. However, you'll have
to run this code for n number of times
so that you can achieve a particular
accuracy.
So after executing for 20,000 times what
happens is a model is created with an
accuracy of 92%.
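If you want to reproduce something similar yourself, here is a minimal sketch of an MNIST digit classifier using the modern tf.keras API. This is not the instructor's create_model_2.py (which uses a lower-level TensorFlow graph trained for 20,000 steps); it is just an illustrative equivalent, and its accuracy will differ.

```python
# A minimal MNIST classifier sketch with tf.keras (illustrative, not the instructor's code).
import numpy as np
import tensorflow as tf

# 60,000 training and 10,000 test images of handwritten digits
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0   # scale pixel values to [0, 1]

# Input layer, two hidden layers, and a 10-way output layer
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),   # probabilities for digits 0-9
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5)

# Evaluate, then predict one image: the index with the highest probability is the digit
print("Test accuracy:", model.evaluate(x_test, y_test, verbose=0)[1])
probs = model.predict(x_test[:1])
print("Predicted digit:", np.argmax(probs[0]))
```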
So what does it mean? It means that out of 100 images passed to the model, 92 predictions will be correct; 92% of the time this model will be able to tell you the exact number that is present on the image. Let's now
wait for this program to execute completely; otherwise we would have to wait for hours. So I've already executed it once
and the model is already created. So
what is a model? A model is nothing but
a set of files and these three files
along with the checkpoint files. Now
there is another script, which is predict_2.py, in which I'm restoring the model, and let me show you the line where I'm restoring it. So it's right here: saver.restore with model_2.ckpt.
So this is the name of my model. So I'm
restoring my entire model that was
created after training for 20,000 steps on the MNIST data set, and I'm passing an image, 7_o.png. It is in this folder, in the test folders; I've got other images, and I've got 7_o here.
Now this is an image. It's the name of
the image. I'm not telling the program
what is the number. So I'll just stop
this training now and now we'll execute
the prediction part where I'm restoring
the model, and this model will tell me the handwritten number that
is there in the image. So this was the
image 7 O and the prediction for this
image is 7. So my model was accurately
able to identify or predict the number
that was there on the handwritten image.
So let me change the image now and let
me execute this again just to show you
the image.
All right, there's one good question
that I would like to take. So Akil asked: how are you saving the weights for the neural net, and can you show us? Sure, Akil. So in the previous file that I executed, I showed you that I'm using an object called saver. And using the saver
object I can save the entire model and
weights are also automatically saved
along with this model in the checkpoint
file. So using checkpoint we can
actually reach the final state of the
training and then we can use the
prediction model. I hope that is fine.
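For reference, here is a tiny sketch of the TF1-style Saver pattern being described; the variable shapes are made up, and this is not the instructor's actual create_model_2.py or predict_2.py code.

```python
# Sketch of saving and restoring weights with a TF1-style Saver (illustrative only).
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

W = tf.Variable(tf.random_normal([784, 10]), name="weights")
b = tf.Variable(tf.zeros([10]), name="bias")
saver = tf.train.Saver()  # collects all variables (weights and biases) in the graph

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, "./model_2.ckpt")     # writes the weights into checkpoint files

# Later, in a prediction script, the same graph is rebuilt and the weights restored
with tf.Session() as sess:
    saver.restore(sess, "./model_2.ckpt")  # loads the saved weights back into the variables
```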
All right. So if you see this seven,
this is a handwritten image. This is
somebody who writes seven like this with
a strike in between. And now I'm passing
a different image of 7. It's 7_1.
So it's different from the first one.
And I'll run this code again.
Now this time the seven is different.
And my machine learning model should be
able to predict that this is a seven
because people write seven in different
ways. Somebody writes seven like this; somebody writes seven and makes it look like a one by making the top part very small. So there are different ways of writing seven, and a machine learning model should be capable enough to figure that out as well. So let me just
close this and see the prediction.
Our model was able to predict this seven as well, and the predicted value is seven. So let me execute it again for
you.
So both the sevens were different but
still the prediction is correct for both
of them.
So now let us go back to our
presentation. So after the MNIST application, let me show you a few more
applications of deep learning. The very
first is face recognition. Let me give
you an example. So all of you are using
Facebook and you do spend some time on
it. So if you remember a few years back
when you used to upload pictures with your friends, Facebook would make a box around a human face, with a prompt appearing at the bottom asking you to type the name of the person to tag him. So it was able to
identify that it was a human face. But
now it is able to autotag. It is not
only able to detect faces but also
identify who it is. And how is this
possible? It is only possible using deep
learning. And Facebook also has a deep
learning library called Caffe2, using which they have applied all these
things. The next use case that is
implemented using deep learning is
Google lens. This is one of those
applications that has been recently
launched for smartphones by Google. What
does this app do? You just have to
install it, open it, point your camera
on a particular thing like this flower
over here and in real time image
processing happens and Google will get
back with the entire details of the
object like the name of the flower where
it is found etc etc. So if you point it
at a building or any shop it will tell
you what kind of a store it is. If it's a restaurant, it will show you the reviews, the ratings, the menu, etc. So what is
happening here is that in real time you
are able to use a deep learning net and
get all the information you want and
these applications are really amazing
because it directly brings deep learning
to the end users or the common people.
So they can easily use the benefits of
deep learning without worrying about
what is happening at the background. The
next use case is the machine
translation. This is again a very
important use case and there is also an
app in play store and this is called
translation app. So here is an image
that says more chocolate.
I don't know what it means. I don't even
know which language it is. But with this
app what you can do is that you can
capture the picture of the packet and
this app will first detect the text in
the image then extract the text like
this and then translate it for you. So
for example, it has detected the text, extracted it like this, and here it has translated "morg," which means dark, and then it writes it back again on the image.
So what is happening in this particular
use case is that first an image is
captured. Image processing takes place.
Text is extracted through processing and
once we have the text we translate it to
the desired language which is English in
this case and then again image
processing happens where we are writing
the text on an image again. So more
chocolate means dark chocolate. So this
is a really great use case because this
is a combination of multiple deep learning algorithms like CNNs and RNNs.
So these were a few more
applications of deep learning. I hope
that you found them interesting.
So, the first thing which comes to our mind: there has been a lot of emphasis on this term called artificial intelligence. So let's first try to understand what artificial intelligence is on a very high level, and why we may need it in the first place for solving a problem. Let's try to understand with an example. A person goes to a doctor and he wants to
Person goes to a doctor and he wants to
get checked at whether he got diabetes
or not. And what doctor would say is
okay there are some tests which you need
to get done and based on the test
results doctor would have a look and
from his experience from his studies and
previous examples the patients he had
seen he would be able to evaluate the
reports and say that the patient has
diabetes or not. So if you just take a
step back and think, I said the doctor has experience. So what do we mean by experience? The doctor has learned what the characteristics are of somebody
having diabetes. Would it be possible to provide this experience in the form of data to a machine and let the machine take the decision of whether a person has diabetes or not? In place of the experience which the doctor learned through his studies and his practice, what we are doing is taking patients' data, with different reports and different parameters, things like the glucose count in blood, the weight, the height, and all these parameters about a human being, feeding it to a machine, and telling it what the characteristics of a person with diabetes are. Let's say we have given one million patients' data to the machine; we then let the machine do this task from its experience, which comes from the data, or historical data to be precise, and do the same task which a doctor is doing. So what we have done is, if you
see from this example, what an artificially intelligent machine is doing is learning from the historical data and trying to do the same thing which an experienced and intelligent doctor was doing. These kinds of domain activities were things human beings were doing; if we can make a machine intelligent enough to do the same tasks, why should we create this artificial intelligence at all? On a very high level, there may be a lot of reasons, but one point is
that human beings have limited
computational power. We may be good at classifying things, you know, you can see a friend in a group photograph and easily say who is your friend and who the others are, and you can easily listen to a language and comprehend what a person is saying. But human beings are not very good at doing a lot of mathematics; if you try doing a good amount of mathematical computation, it is probably not very easy. And second, it's not possible for human beings
to work continuously, let's say 24 by 7 for 30 days straight. If we can make a machine do such stuff, then for one, it would be able to do these computations very fast; we spent a good amount of time discussing GPUs, and I also mentioned that Google is talking about TPUs, tensor processing units, which would hugely change the entire paradigm of computation, and machines would become more and more competitive with, or even better than, human beings in some of these fields.
This is the formal definition, but if I loosely translate it, it's basically that artificially intelligent machines are those machines which can do tasks which human beings can easily do. So, things like identifying what's written, let's say, on a license plate, or playing games; I'm sure some of you have already heard that machines have defeated the Go champion and top chess players. Now we have digital agents like Siri and others which can understand what we want them to do and can take intelligent decisions from text or from voice itself. Basically, these are very
high level, and some of the fields where deep learning has made great inroads are things like game playing, expert systems, self-driving cars, robotics, and natural language processing, and there may be different and new areas where we are implementing it. All of you know that every experience of human beings is getting digitalized: the kinds of things you buy, the kinds of things you watch, what your preferences are, who you like on Facebook, who you don't like, what kinds of movies you like, and all these things, in terms of reviews, are being captured online.
>> So once your data is going and being captured online, there are systems which can analyze this data. So we have this huge data generation, and now machines which can process it and make some intelligent decisions are available. That's why you will see there has been a lot of emphasis in the last couple of years, and a lot of new things are coming in. Some of you who have been reading papers on different subjects and different architectures would know that most of these architectures are not very old.
>> It's a very dynamic field. Every day, in fact on a weekly basis, you will be hearing about a new API or a new kind of architecture being developed by somebody. Most of the stuff we will be studying in these classes is not very old, like convolutional neural networks and recurrent neural networks; some of their variants are as new as last year. If you guys follow TensorFlow closely, they introduced an object detection API which TensorFlow has made available for everybody, and you would be able to see for yourself that this API works. There are five different options for selecting different deep learning architectures or convolutional neural networks for this API, but it's been able to identify human beings and about 90 other object classes with almost 99% accuracy, in some cases where even human beings would find it difficult to see and predict what the object is. So this machine has gone even beyond human capability in terms of identifying these objects.
Given that we have a fair, very high-level understanding (we haven't gone into the details) of artificial intelligence, basically the loose understanding that artificial intelligence is about machines making decisions for tasks which human beings were earlier doing, something like understanding language or driving a car, let's understand how machine learning and deep learning relate to artificial intelligence. Given that your machine is now able to understand and learn from data, we can solve multiple business problems with the help of this.
>> So let's take a very small example: whether a person has diabetes or not. I was mentioning that this kind of decision is taken by the doctor based on the reports he has got, and these reports have some numbers, like the number of times a patient had a particular kind of issue, what the glucose count is, what the BMI is, and what the person's age is, and based on these numbers the doctor was able to make this kind of decision. We can take the same analogy as where we were trying to predict which species of flower it is: we can take information about patients, with different attributes and different features from a report, and the machine would be able to learn from these data sets, and for a new customer or a new patient it would be able to classify whether the patient has diabetes or not.
So there are two sections to it. The first one is the information about the patient and different characteristics of his health; the columns from number of times through glucose count up to age are the information points about the patients, and the last column is the information on whether the patient has diabetes or not. This kind of problem, where we have some information points explaining what the situation is and the output information in the last column is in some kind of classes, is a specific type of machine learning problem; but as of now, the characteristic is that we have some information about the patient and the last column tells me whether the patient has diabetes or not. So what a machine basically learns is all those rules; in the example which I was quoting, in earlier cases people used to create hand-coded rules to predict whether an event will happen or not, but in machine learning your algorithm will learn from the historical data and see what the combinations are which decide whether a patient has diabetes or not.
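As a small, hedged aside, here is a sketch of how such rules can be learned automatically with scikit-learn; the column names and numbers are hypothetical, purely for illustration.

```python
# Learning if/else style rules from historical data with a decision tree (illustrative values).
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

data = pd.DataFrame({
    "glucose":  [85, 168, 122, 190, 99, 140],
    "bmi":      [22.1, 34.5, 28.0, 36.2, 24.3, 31.0],
    "age":      [25, 52, 40, 60, 30, 47],
    "diabetes": [0, 1, 0, 1, 0, 1],   # the output (teacher) column
})

X, y = data[["glucose", "bmi", "age"]], data["diabetes"]
tree = DecisionTreeClassifier(max_depth=2).fit(X, y)

# Print the learned rules, analogous to "if glucose < 99.5 then ..." drill-downs
print(export_text(tree, feature_names=["glucose", "bmi", "age"]))
```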
And these combinations would be something of this type; it is only for illustration, not the real numbers, but it shows that your machine learning algorithm has been able to identify these rules based on the historical data. So after learning, it has created a rule such as: is the glucose count less than 99.5? If yes, then go to the next check; if no, then check the next condition, and depending on that the person either has diabetes or does not, with further drill-down rules below. So all these combinations or rules are
dynamic in nature, and what I mean is that these rules would change if your data changes. The same model can do the work of deciding whether a patient has diabetes or not, and you can take the same model and make it learn on a new data set, let's say flower species, and it would be able to learn the new rules by itself. So the intuition is that, just as human beings learn from examples, your machine learning algorithms also learn from examples. But just to frame our
problem statement that machine learning
we know it learn by experiences and from
the data from the historical events
there are three kind of problems which
may be interested in solving. First one
is called a supervised learning problem
and supervised learning problem is
basically occurs when you have some
input variables and one output column.
Both the examples we discussed till now fit this. One was on the flower species, where we take data on different features of a flower and then which species of flower it was: the last column is the dependent variable or output variable we are trying to predict, and all the information variables are called input features or input variables.
>> So input features or input variables, these terms are used interchangeably. Input and output are the two different sections of the supervised learning process. Why is it called supervised learning? One way to explain it is that we have a column to guide the algorithm on whether it is making the correct decisions or not. Let's say your model says a person has diabetes, but the actual data says the person does not have diabetes. You have some kind of correction mechanism within your data itself which can help your model tune itself to make better predictions. This kind of output variable has in some texts also been called a teacher variable, because it guides the algorithm to decide those rules which I just went through.
Another type of machine learning is called unsupervised learning. In unsupervised learning we have only the input features, and our objective is that we should be able to identify the patterns within the data itself. Some of you who are working in the telecom domain or on marketing campaigns would be very familiar with segmentation analysis or cluster analysis, where the objective is to identify coherent groups within a larger population. We take the customers as the whole population and, based on different parameters and variables, we identify groups of customers or products, whichever the business problem is, which are similar in nature, so that we can either run a marketing campaign or develop new products for those specific groups.
The final one is called reinforcement learning. Reinforcement learning is a kind of learning where the agent learns from the environment, and it works on rewards and penalties. You can think of a self-flying helicopter: you leave it in the environment and it decides, based on the wind speed and other parameters in the environment, how it should fly, and the reward is that the fuel should be spent efficiently and it should spend more time in the air. So in supervised learning, as I said, the objective is that we have some input features, which hold information about different aspects of a given problem or about customers, and we have an output variable which explains whether the event happened or not, some kind of output variable. There are two types within supervised learning: one is called regression, the second is classification. The differentiation happens only because of the type of output column. If your output column contains numerical or continuous values, like numbers, then, at a very high intuition level, that is a regression kind of problem. For example, you're
working on a problem where you need to
predict how much would be the sale of
your company given the information that
how much they're spending on marketing,
how many employees are working, which
month it is of the year. If you have this information, you are going to predict how many millions of dollars of sales your company would be doing. These kinds of problems, where your output variable has numerical values, are regression problems. On the other hand, if the dependent variable is of a categorical nature, of discrete values, it signifies that it's a classification problem. Given that the values are categorical, your objective is to put the different customers or products into different classes, and that is why the name is classification problem. So as I said, there are two kinds of supervised learning problems: one is regression and
another one is classification. So let's
take a use case where we need to predict
the housing price of a particular
locality. And we have information about
these houses on different parameters and
these parameters are like these. So
let's say what is the crime rate in that area, how old the home is, how far it is from the city. This is from the Boston housing data. This column, if I'm not wrong, is the percentage of black population or some such variable; we have the description later on. And then there is the actual price of the home. All these features, from CRIM up to LSTAT, are the information about the house, so these are my input features, and the output feature is the price of the house, which is in thousands of dollars. Our objective is to fit an algorithm that can learn from all the historical homes which were sold, based on all the features and the price each was sold for, and once the model is trained, it should be able to predict what the price should be for a given home. Let's take an example. Let's say one of you is interested in buying a home in the Boston area and you would like to know the ballpark figure for a two-bedroom flat of some square footage, let's say 20 miles from a specific location: what should be
the price. One way would be to go and talk to people and try to understand what the average price has been. Or, if you have this kind of algorithm available, trained on data saying that given these features the price was this much, then a regression model would be able to tell you, given the features of the new home you are interested in buying, what its price should be. So as I said, there are two sections: one is the independent variables, all this information about the home, and the last variable is the dependent variable, which is the information about what the price was. You see, this is a kind of scatter
plot between the distance from the city and the price of the home, keeping all other variables constant. We are not looking at the influence of other features; we are only finding a relationship between the distance from the city and the price. From this graph you can make out that the farther the houses are from the city, the lower the price would be, if you keep all other things constant. If you can identify this kind of relationship, that's called linear regression. For a given distance from the city, you would be able to predict what the ballpark figure for a house should be, if you just have this information and not all the other information we talked about. In a similar fashion, there would be a relationship between the price of the house and all the features which we discussed. What we saw here is only the relationship between one of the variables, distance to the city, and the price of the home. In a similar
fashion we would be able to find
relationship between the price and all
other features. So all of the features
if you know like how old is the home,
how big it is, what is the crime rate
and all. So this kind of model is called a regression model, and it uses the very basic equation of a straight line, y = a + b·x. Here y is called the dependent variable, a is the intercept, b is called the slope, and x is the independent variable. If you go
deeper, what this equation is basically telling me is that if I already know the relationship, that is, if I know the values of a and b from my historical data about different homes, then given the value of x I should be able to calculate y, which in our particular example is the price of the home. Let me try explaining what slope means. The slope is the change in y when we change x, the independent variable, by one unit. So if I change x by one unit, how much change happens in y? That helps me understand what kind of relationship there is between x and y. And a is the value which tells you the value of y when x is zero. You can think of it like this: if you put the value of x equal to 0, whatever the value of y is, that is the intercept. But
example we were discussing the
relationship between the distance from
the city and the housing price. Even
though the house is exactly in the city
then there would be some value and even
though the house is 100 miles from the
city there would still be some price. So
it help us kind of intuitionally
understand the relationship. It also
help the line to understand where it
start whether it start from the origin
or some place within your axis. And this
kind of equation is called equation of
linear regression because if you see
here the power of x is one. So that's
why it's linear in nature. And it is
also that we are fitting the
relationship between x and only one of
the variables. Multiple regression where
instead of finding the relationship between only one variable and the dependent variable, we use the fact that in most practical scenarios the dependent variable y depends on more than one feature. For example, your house price depends on all these features, all these information points available, and all you want is that your regression model should be able to identify the relationship given all the information together and then predict the price. The equation becomes y = b0 + b1·x1 + b2·x2 + b3·x3 + …, and these coefficients b0, b1, b2, b3 help us understand what the contribution of a given variable is to your regression equation. So let's have
a look at how you can fit this model in Python. If you look at the first block of the code, we are saying import pandas as pd, import numpy as np, and import matplotlib.pyplot as plt. This is the convention in Python for importing the libraries we will be using; these are the libraries required for running this regression model. Once we import them, all the functions available in these libraries can be called very easily, and we will be seeing how you can call them. Once you have imported these libraries, you see that we are loading the data set called Boston, and this is the same data set which we have been discussing in the use case here.
So here we are importing and loading the Boston data set. In the next line of code we are calling the pandas library, because we have imported it as pd, and then calling a function called DataFrame so that we can create a data frame in Python from the Boston data; it will create boss as a data frame. A data frame, in very loose terms, you can think of as a spreadsheet kind of format, where your data is put in rows and columns; think of an Excel-file kind of framework for a data frame, though it is different, just for intuition purposes. After importing, I'm calling .head(), and what head does is give the top rows (five by default) of my data set. There are all 13 columns, and in Python the index starts from zero; you can see the index starts from 0, 1, 2, 3, so you can see what this data is. This line of
command which says .columns: .columns gives you all the features available in your data set, so these are the different feature names, the column names of the data, plus the price. Towards the end of the code I have also written one line which gives you all the details about what the target variable is, what the history of the data is, where it was recorded and so on, so you can easily look at that. We are calling Boston.target and assigning it to a price column in our boss data frame, because the target is available as a separate data vector in the Boston data set itself. Now we are saying y is equal to this price column; y will represent our dependent variable. Then boss.drop('PRICE', axis=1) means that we drop the price variable from the overall data frame, and axis=1 specifies that we are removing a column: axis=0 represents row-level operations and axis=1 represents column-level operations. In the print statement we are just printing x, and this x is all the input features of our data set, all the columns we will be using for predicting the housing price. This is not the actual model yet, but what will happen once we have created the model is that it will fit a line through the actual data set, something like: price is equal to some intercept term, plus b1 multiplied by crime, plus b2 multiplied by another variable, plus b3 multiplied by another variable, and so on. This intercept and b1, b2, up to bn are the coefficients which your model will learn from your already available data. And here we are showcasing the top five values of our housing price: y is the dependent variable and we are looking at its top five values. So this was only a very brief and very basic introduction to how you import the data and see the different columns. It has nothing to do with machine learning yet; it is only for people who are new to Python, or people who have been out of touch with Python and just want to brush up their skills.
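For reference, here is a minimal sketch of the data-loading steps being described, under the assumption that the classic load_boston loader is available (it was removed in scikit-learn 1.2, so on newer versions you would have to obtain the Boston data from another source or use a different data set):

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_boston   # removed in scikit-learn >= 1.2

boston = load_boston()                      # bunch with .data, .target, .feature_names, .DESCR
boss = pd.DataFrame(boston.data, columns=boston.feature_names)
boss['PRICE'] = boston.target               # add the target as a price column

print(boss.head())        # first five rows
print(boss.columns)       # all column names
print(boston.DESCR)       # full description of the data set

y = boss['PRICE']                   # dependent variable
x = boss.drop('PRICE', axis=1)      # input features (axis=1 drops a column)
print(x.head())
```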
code if you have a look which I'm
highlighting now it is we are using a
scikitlearn model for test train and
split and what it's doing is for both
because we have already x and y the test
sizes we are saying 33 so it's basically
we are randomly selecting 33% of the
data for we putting it sep separate in
the test bucket so that we can test it
later on. And this random state five
means because we'll be randomly
selecting if you specify a random state.
Every time you run this code, you will
be selecting the same set of elements
from your data set. It help you
understand that the variation if you run
the code multiple times by changing the
variation is not coming because of the
selection of sample. It should be
because of the different model changes
you are making. This dotshape function
in Python specify that what is the
dimensions and if I mention the first
one X train.shape is giving me 339 and
13. Basically it's telling me that there
are 339 rows and 13 columns in the data
set. X test there are 167 rows and 13
columns. And your Y test is just 339
rows and there's just one column or it's
just one vector. The number of rows in X
train and Y train are same. In X test
and Y test the number of rows are same
because they have been selected for the
same combination. So same houses we have
selected the input features as well as
the corresponding values of output and
same has been done for the test section.
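A minimal sketch of this split step, continuing the x and y assumed in the loading sketch above:

```python
from sklearn.model_selection import train_test_split

# 33% of the rows go to the test bucket; random_state=5 makes the split reproducible
X_train, X_test, Y_train, Y_test = train_test_split(
    x, y, test_size=0.33, random_state=5)

print(X_train.shape, X_test.shape)   # e.g. (339, 13) (167, 13)
print(Y_train.shape, Y_test.shape)   # e.g. (339,) (167,)
```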
What we are doing here, as I was saying, is that we have imported a library called scikit-learn, and scikit-learn has different modules for different machine learning algorithms; linear regression is one of them. So we can take the linear regression module from scikit-learn and, using what in Python is called an assignment, assign it to a variable called lm. Now, if you look at this line, lm.fit, we are basically telling it to use the linear regression module from scikit-learn and fit the model between X_train and y_train. In other words, we are telling the model to learn the coefficients for the different x values given the y values in the training data set. What the fit function does is calculate the values of your intercept and b1, b2, b3 for all the features in your input, for the given y variable. Once the model has been fit, we can use the same learned model, lm, with a function called predict on X_train. Once it has learned those coefficients, the intercept and b1, b2, b3 for all the input features, you can use the predict function to make predictions for your training data set; you already have the actual values as y_train, so you can compare how well your model is doing. The same thing has been done for the X_test data set, lm.predict(X_test), and again the prediction has been made. If you see here, I have put the actual and predicted values together as a data frame, y_test and y_test_pred, and the difference looks like this: for the first value the prediction was 37.6 and the actual value was 37, and so on for each predicted and actual value. This difference between the actual value and the predicted value signifies how much error there is in your predictions. Had your model given exactly the same prediction as the actual value, you would say there is no error, your model is 100% accurate and all the predictions being made by the model are absolutely bang on, but that normally doesn't happen. You end up having predictions which are a bit off from the actual values, and we measure that difference as one of the parameters to identify how correct your model is.
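A minimal sketch of the fitting and prediction steps just described, continuing the X_train / X_test / Y_train / Y_test variables assumed above:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

lm = LinearRegression()            # assign the linear regression model to lm
lm.fit(X_train, Y_train)           # learn the intercept and coefficients b1..bn

print(lm.intercept_)               # learned intercept
print(lm.coef_)                    # learned coefficient for each feature

Y_train_pred = lm.predict(X_train)   # predictions on the training houses
Y_test_pred = lm.predict(X_test)     # predictions on unseen test houses

# Put actual and predicted values side by side to eyeball the errors
comparison = pd.DataFrame({'actual': Y_test.values, 'predicted': Y_test_pred})
print(comparison.head())
```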
There are a couple of metrics used for this, and the most basic one is called mean squared error. Mean squared error is based on the difference between the actual value and the value predicted by the model. What I
mean by this is that let's say this is
your predicted value 37.6 and this is
your actual value. What you do is you
take the difference of these two and
then take the square of it. Why do we take the square? Because for some values the difference may be negative and for others positive, and if you simply sum them up the differences may cancel out to zero, and you may end up thinking the model is doing really well. To avoid that, what we
do is that we take the square of it so
that the difference between actual and
the predicted becomes positive and you
can sum it up to showcase that how far
your predicted values are from the
actual value and then you take a mean of
it to showcase that what is the mean
difference between the actual value and
the predicted value. It can also be used
for model comparisons. Here I can show you how it works. Let's say you have some actual values like these, and let's say you fit a model and I fit a model, so there is model prediction one and model prediction two. What you can do is take the difference between the actual and the predicted value, here the difference is 2, and then take the square of it, which is 4; for 23 and 21 the difference is again 2 and the square of 2 is 4; for the third one the difference is 5 and the square is 25; and so on for all the values. You get the total, the sum of squared errors, divide it by the number of inputs, which is five, and you get the mean of the squared errors. You do the same thing for the second model, and if you see, its value comes out much smaller. So this helps you understand which model is doing a better job of predicting the housing prices or any other numerical variable.
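As a small worked illustration of this comparison (the numbers below are made up, not the ones on the slide):

```python
import numpy as np

actual  = np.array([10, 23, 30, 18, 25])
model_1 = np.array([12, 21, 25, 20, 22])   # hypothetical predictions from model 1
model_2 = np.array([11, 22, 29, 19, 24])   # hypothetical predictions from model 2

mse_1 = np.mean((actual - model_1) ** 2)   # square the errors, then average
mse_2 = np.mean((actual - model_2) ** 2)
print(mse_1, mse_2)   # the model with the lower MSE is predicting better
```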
There is another statistic used for judging how good your model is, called mean absolute percentage error. That is basically the absolute difference between the actual value and the predicted value; you sum it up over all the entries and divide it by the sum of the absolute values of your actual values, and it helps you understand by what average percentage your predicted values differ from the actual values. So it comes out as a percentage. What I've done is taken the absolute difference between the actual values and the predictions, summed it up, and divided it by the sum of my actual values. Whatever value comes out, you can say, okay, it's 5%.
So it'll be fair to say that your model
is 5% off from the actual values or the
error term in your model is let's say 5%
or 6%. And whatever predictions you're
making from the model you can keep a
buffer of that percentage when you share
it with the team. And what I mean is
that let's say your mean absolute
percentage error is around 10% and it's
about the sales of a company. When you share this forecast, you can say that the predictions are around 90% accurate and the actual sales may be plus or minus 10 percent. This helps you communicate that kind of variability in your predictions. You can implement this in Python yourself, or, from the scikit-learn library, you can call the mean_squared_error function and it can help you calculate the MSE for a given model.
for a given model. So basically there
was a very quick introduction to linear
regression. Though there are different
applications but one thing remain common
that we are trying to predict the
dependent variable whose nature or the
type of dependent variable is a
numerical or continuous data. Some of
the applications like predicting life
expectancy based on these features like
eating patterns, medications, disease
etc. You can predict housing price. We
have already seen the example on that.
We can predict a person's weight from different features like sex, height, and prior information about the parents. And you can also predict the yield of a crop based on different parameters like rainfall and so on. As I said, this is a very limited list of use cases. I'm sure people who are working with a sales department have to predict how much the sales would be; people who are working with call centers need to predict the number of calls for next month; and people who are working in marketing need to predict the footfall in a given company or mall. So there are different applications of regression models, but one thing is common across all of them: the dependent variable which we are trying to forecast is of a continuous data type. So let's move to the next agenda item, logistic regression. At
the time of the introduction to machine learning, we discussed that there are two kinds of supervised learning techniques: one is regression and the other is classification. The major differentiating factor between the two was that in regression we had a dependent variable of continuous values, and in classification problems we had a dependent variable of categorical type. So let's take an example of how we can do it. Here is a use case where we have some information about some customers, and the data set looks like this: we have a customer ID or user ID, the gender of the customer or user, his or her age, the estimated salary every month, which you can think of in any one currency, either INR or dollars, and whether this user purchased an SUV or not. As I was saying earlier, the dependent variable here is 0 and 1, so it's a discrete or categorical value which we need to predict, and the features we'll be using in the model are age and estimated salary.
Why would we need a logistic regression kind of algorithm? It might seem a straightforward process: take purchase as a numerical value, 0 and 1, take some input features like age and estimated salary, and fit a linear regression, and you would be right in saying that to some extent this is possible. But there are two major problems if we follow this, and some of you can think about what the problems may be if we try using linear regression to solve this kind of problem. One limitation I can think of is that here I'm looking for an output which gives me some kind of probability of how likely a person is to buy a product or service. The restriction with probability is that it should be between zero and one: zero signifies that there is no likelihood of the event happening, and one means that it is certain the event will happen. There is no possibility of probability values less than zero or greater than one. But if I fit a linear model taking the purchased column as my dependent variable, my values can go beyond one or below zero, because linear regression has no such limitation. So what I require is to fit the model in a similar fashion to linear regression, with the equation I used earlier, so that the information still comes from my features in the same way, but with this y mapped to values between 0 and 1. Given the limitation we have just talked about, that a probability should be between 0 and 1 and should not go above one or below zero, I need to find a way to force this y value coming from the equation into a value between 0 and 1. To solve this
problem there is a function called the sigmoid activation function, which will also be used extensively in our deep learning sections at different places. Logistic regression comes from this activation function itself. The function looks something like this: the output value is 1 / (1 + e^-x), where x here is not one of the inputs but whatever value we give it. If you feed any value into this equation, it can take any value between minus infinity and plus infinity and map it to between 0 and 1. If you give it a highly negative value, the output will be very, very close to zero; if the input is highly positive, the output will be close to 1; and if the value of x becomes zero, what would the output be? One half, because any value to the power 0 is equal to 1, and 1 / (1 + 1) equals 1/2. So logistic regression is nothing but an extension of your linear regression with one additional fact: you want to force your output to lie between zero and one, and for that you use the sigmoid activation function, sometimes also called the logistic activation function. This is the intuition behind logistic regression: you take the value of your equation from the intercept and the different coefficients for your inputs, and you map these outputs between 0 and 1.
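As a quick illustration of this mapping, here is a small sketch of the sigmoid function (a generic illustration, not code from the course slides):

```python
import numpy as np

def sigmoid(x):
    # maps any real number to the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(-10))   # very close to 0
print(sigmoid(0))     # exactly 0.5
print(sigmoid(10))    # very close to 1
```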
To recap: logistic regression is nothing but an extension of linear regression with a restriction that the output is mapped between 0 and 1. Had we fitted a plain regression equation, we would have scenarios where the value goes beyond one or below zero, and to avoid this we fit logistic regression with the help of the sigmoid activation function, which looks like this. As I was saying, if this point is zero, then for all the values which are positive and greater than zero the curve goes tangentially towards one, for all the values which are less than zero it goes towards zero, and at zero the probability is 0.5, so there is a 50% probability when your model output is at or very close to zero. It can be used for multiple scenarios:
one example we are taking is whether somebody will buy an SUV or not, but you can also solve problems like whether somebody will say yes or no to a product or service, whether something is true or false, high or low, or any other two categories. Logistic regression can also easily be used for multiclass classification problems; to give you a very quick introduction to how that works, in multiclass classification it does a kind of mapping of one class versus the rest of the classes, and then the same analogy follows for deciding which class a particular event should be associated with: at the end of the day, whichever class or category has the highest probability, the model predicts that the example belongs to that class. Just like MSE, we have a statistic or a
parameter to evaluate how well a regression model is doing: it checks the difference between the actual value and the predicted value. We take the actual value, subtract the predicted value, take the square of it, do this for all the examples, and divide by the number of training examples, and it gives you a number. I was also saying that this number is helpful for comparing different models: let's say you fit a model, I fit a model, and we compare the MSE for both; whichever model gives a lower MSE is probably doing a better job of prediction. In a similar fashion, we need a statistic to see how well your model is doing when it is solving a classification problem. So here there
are four categories. Let's say we have only two classes, good and bad: the actual values are good and bad, and what your model predicts is also good or bad. For examples which belong to the good category where your model also predicts the good category, these events or examples are called true positives, because your model is making a correct prediction on positive examples. Another category is where the actual value for those examples is bad, they belong to the bad category, and your model also predicts them as bad; these are called true negatives, and these are also correct predictions, because whatever the actual value is, your model predicts the same thing. However, there are two categories of mistakes: where the actual value was bad but your model predicts good, these examples are called false positives, because your model is falsely predicting that they are good examples; and the last category is called false negatives, where the actual value was good and your model predicts bad. So how do you say which model is doing a good job? We calculate what percentage of examples have been predicted correctly. The sections in blue, true positives and true negatives, are the examples which your model has been able to predict correctly, and the two groups false positives and false negatives are the incorrect predictions. So what we do is simply take the percentage of correct predictions. This matrix is also called the confusion matrix.
Let's say there are some examples, out of which 65 are ones where they actually were good-category examples and your model also predicts them as good. 44 are those which belong to the bad category and your model also predicts them as bad. Eight are actually bad but predicted good, and four are actually good but predicted bad. What you do is sum up all the correct examples, 65 and 44, and divide by all the examples in your data set, correct and incorrect ones, and here you get about 89%. So you can say that your model is able to predict with roughly 89% accuracy. If you want to explain it to your business team, you can say: whenever I give you a prediction that 100 customers will churn and hand you a list of 100 customers, I can say with some certainty that at least around 89 of them will churn, because the model has about 89% accuracy. That's how it gets communicated to business teams: we think our model is about 89% accurate, and whatever prediction we give, we are quite certain about it. But if your model accuracy is 70 or 60%, then when you give the predictions to your business team you say, okay, we are giving you the predictions, but we are not very certain whether they will be correct or not. So this accuracy percentage plays a similar role to the MSE we used for linear regression: it is another metric to see how close or how correct the predictions have been. So now we can
see the implementation of logistic
regression in Python. So first few lines
if you see, we are importing the libraries we require, the machine learning libraries, to do the data manipulation. We are importing the data, which is in CSV format, and this data is already available on your LMS; you can easily import it from the LMS itself, unlike the Boston data set, which we imported from the library. Here we have a flat file, the social network ads CSV, and you can call the read_csv function of the pandas library to import the data. So you are importing the data as dataset, and as I said, head shows the top five rows of
your data. So here we have only five
columns. One is user ID, gender, age and
salary. And the last column is our
dependent variable which signifies
whether a customer or a user bought the
SUV or not. So it's 0 and one and one
means the person bought. In the previous code we used one convention for selecting X and y; here we are showcasing another way of selecting, called .iloc, which looks things up by location. Going through this convention: within the data set we are specifying the locations. The colon means that we want all the rows, and, as I was saying earlier, in Python the index starts from zero, so when we say we want columns 2 and 3, we mean this is zero, this is one, this is two and this is three; we want columns two and three as our input features, and .values creates an array of these two columns. We could have used gender as well, but I will leave that to you: first you need to encode gender as a vector of 0 and 1. You can write a function which says, if gender is equal to male then one, else zero, or you can create dummy variables; there are functions available in scikit-learn for that. So it's an exercise for you; the code is already available, but I would encourage you to also include the gender information in your model. The next line of code says that the dependent variable is all the rows and column number four: column number four is your purchase information, whether a customer bought the SUV or not, and again .values converts it into an array format. So we have specified two things: the two columns with the information about the user, in terms of how much money they make and what their age is, and the y information, whether a customer bought the SUV or not.
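A minimal sketch of the selection being described, assuming the ad-clicks data is saved locally as 'Social_Network_Ads.csv' (the exact file name is an assumption):

```python
import pandas as pd

dataset = pd.read_csv('Social_Network_Ads.csv')   # assumed file name
print(dataset.head())   # User ID, Gender, Age, EstimatedSalary, Purchased

X = dataset.iloc[:, [2, 3]].values   # all rows, columns 2 and 3 (Age, EstimatedSalary)
y = dataset.iloc[:, 4].values        # all rows, column 4 (Purchased: 0 or 1)
```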
In the next line we are doing the train and test split, for the same reason as before: to evaluate whether the model, which learns on the training data set, still makes correct classifications on the test data set, which was not involved at the time of training. The 0.25 means that we are selecting 25% of the data for the test set and the remaining 75% for training. What is the correct
split of train and test? Normally you choose something like 60/40, 70/30 or 80/20. If your data set is big enough then I think an 80/20 kind of split is good, and you can try different combinations, but as a rule of thumb most of the time I have seen people taking something like a 60/40 or 70/30 kind of distribution between train and test. Now there is one important thing
for data pre-processing, and that is the scaling step I have included here. Some of you who come from a machine learning background will already know how important it is to scale your data. What do I mean by scaling? If you look at the data set we are using for input, one column is age and the second is income. Age can be somewhere between, let's say, 1 and 100, or 120 at most, while your income is in the thousands or hundreds of thousands; both these values are on different scales. Scaling your data, say bringing all the values between 0 and 1, helps you understand the importance of each variable. For example, if you look at a regression equation and the coefficients b1, b2 for all the input features, these coefficients can give you an indication of how important a particular variable is. But this intuition is only correct if all the features are on the same scale. If features like age and salary are on different scales, you will not be able to compare what these coefficients really mean, because your values come from two different scales. So it is always a good idea to have all your features on the same scale. There are multiple ways of doing
it, and there are multiple types of scaling. The simplest one is called min-max scaling. What it means is that for a given column, say the age column with values like 19, 35, 26 and so on, I take each value, say 19, and the minimum value of the column, which here happens to be 19 as well. The formula is: x_i minus the minimum of the column, divided by the maximum of the column minus the minimum of the column, so for age it is (x_i - min(age)) / (max(age) - min(age)). If I apply this formula to all the values in the age column, it converts all of them to values between 0 and 1. There are other ways too; I also mentioned standardization, where you take the value minus the average and divide by the standard deviation, if I recall it correctly. Whichever method we apply, all I am saying is that these values of age and salary should be brought to the same scale. So if I'm applying min-max scaling, I'll apply it to both my columns so that both these variables are on the same scale and I can use them in my model. And this is
again a very important point: whenever you do scaling, you follow this process of fitting the scaler on the train data set and then using the same learned scaling from the train set on the test data set. Basically, the data set on which your model is being trained will have its values converted to between 0 and 1 based on its minimum and maximum values; if the test data set had different minimum and maximum values, the same number could end up with a different scaled value. That's why the process is that we fit our scaler on the training data set and use the same minimum and maximum values for scaling the test set as well. It gives the same scale for all the values, and for model predictions it's very helpful.
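A minimal sketch of this fit-on-train, apply-to-test pattern, continuing the X and y from the earlier iloc sketch and using scikit-learn's StandardScaler (the random_state value and the choice of scaler are assumptions):

```python
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

sc = StandardScaler()
X_train = sc.fit_transform(X_train)   # learn scaling parameters from the training set only
X_test = sc.transform(X_test)         # reuse the same parameters on the test set
```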
In a similar fashion to how we called the LinearRegression object from scikit-learn in the previous example, in exactly the same way we can call LogisticRegression from scikit-learn: it lives in the scikit-learn linear model module, and we import LogisticRegression. Now we are
fitting the logistic regression between X_train and y_train, the same way we did it earlier for linear regression. Once it has been fit, we can do the prediction for the test data set: we call the predict function of the classifier on the X_test data set. Here the default probability threshold is 0.5, so what your algorithm does in the back end is that for every example where the predicted probability on X_test is greater than 0.5 it is tagged as one, and for every example where the probability is less than or equal to 0.5 it is tagged as zero. Now we are calling the confusion_matrix function between y_test and y_pred, so we are comparing the counts of your true positives, true negatives, false positives and false negatives. These are the true positives, these are the true negatives, and these are the misclassified values, and if you want to calculate the accuracy you can easily do it as (65 + 24) divided by (65 + 24 + 8 + 3). All we are doing is identifying the percentage of correctly predicted examples; that is what this section of code does.
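A minimal sketch of the fit, predict and evaluation steps just described, continuing the scaled X_train / X_test from the scaling sketch above:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, accuracy_score

classifier = LogisticRegression()
classifier.fit(X_train, y_train)        # learn the coefficients on the training set

y_pred = classifier.predict(X_test)     # probability > 0.5 is tagged 1, otherwise 0

cm = confusion_matrix(y_test, y_pred)   # layout: [[TN, FP], [FN, TP]]
print(cm)
print(accuracy_score(y_test, y_pred))   # e.g. (65 + 24) / (65 + 24 + 8 + 3) = 0.89
```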
If you see what it has done: your logistic regression model has fit this boundary, and you can see it's a straight line, which is why in some texts logistic regression is also called a linear classifier. Why is it called a linear classifier? Because it is predominantly built for fitting a linear equation. The logistic regression equation was y = a + b1·x1 + b2·x2 and so on, with these coefficients and their respective inputs, but the highest power of your inputs was one, and you would already know that a polynomial of power one stands for a straight line; that's why you see a straight line. Some of you would argue that we can fit a nonlinear boundary with the help of logistic regression, but you would also concede that those are tricks we use to create nonlinear boundaries, for example by introducing higher-order polynomial terms into your model so that the separation becomes nonlinear. These kinds of algorithms are really helpful only for problems where the objects are linearly separable. When the separation between the objects is not linearly separable, these algorithms are not very helpful, and we need to identify algorithms which can fit nonlinear hypotheses or separation boundaries between different
classes. So let's take a use case to understand the simple scenarios where unsupervised learning can be used and how it really works. Let's say we have some housing data, in terms of where the houses are located; the white dots on the screen against the blue background show where these homes are. The objective of an education officer is to find a few locations where schools can be set up, and the constraint is that students shouldn't have to travel much. Given this constraint, the officer needs to decide the locations. We can identify them even without using any algorithm: let's say I'm the officer, I need to open three schools in the locality, and I know where the homes are located. I can easily see that, okay, probably this is one location, I'm just highlighting it, keeping in mind the constraint that students shouldn't have to travel much. If you opened the school here instead, everybody would say it's not a great location, given that it's far away from the population, so that is not the correct location. From the perspective of the homes, these three locations, chosen by human judgement, or by any of you given the task without an algorithm, would mean that most students travel less to get to school. So given this problem, we
can easily see that we don't have a dependent variable as such telling us whether a location is correct or not. All we have is a number of school locations we need to find, and the home locations, and based on the distance from each home we need to identify the proper locations for these schools. Another thing which follows from the same logic is that there are no predefined classes for these locations. One more point, which some of you who have done clustering or segmentation work will recognize: this number, whether we say three or four or five, is not predefined. Most of the time it is given by the business, how many clusters or segments they are looking for, though there are statistical ways of identifying what the best number of clusters should be. Let me give an example: I was working for one of the Indian telecom companies quite some time back, and at that time their subscriber base was around 300 million customers. Imagine trying to create segments for this big a population: if you create only three or four clusters, you can easily understand that it would be very difficult for a marketing team or any product team to design products for such a big population. So though statistically it may look like there are four or five unique segments, you end up creating a lot of small segments, and there may well be 20 or 30 segments for such a big population. My intent in saying this is that the number, such as the three locations we are trying to identify within the population, has to be decided either by the business or by people like you who have knowledge about the data, about what kind of business is being run, and about the final usage of the segmentation exercise. So let's
see one way of selecting these school locations. One way is what we were already doing: somebody looks at the homes, sees where the density is high, and picks the locations manually. But there are also algorithms available to do this task, and I can give the name here itself: the K-means algorithm. First we would like to understand how the algorithm works when it needs to identify the best locations. If you're looking at my screen, let's say our objective is to identify two locations first; we have some data, a scatter plot is available, and we need to identify where the schools should be so that the distance from the homes is minimal. How can we do it? Let's say we randomly pick two points, and actually the easiest way is to randomly pick two points from your data set itself, and then you assign these two selected points as your cluster centroids, the centers of your selected population. Then, in the second step,
once you have initialized these two random points, you measure the distance of all the homes from the initially selected points. So you measure the distance of this home from this selected point, and again from that one: for each house, we measure the distance from each randomly initialized point and see which distance is smaller. We can easily see that this distance is smaller than that distance, so this point would be assigned to this particular group. So the first step is initialization: initialize as many centroids as the number of clusters you need. In the second step you do the cluster assignment: you measure the distance from each of the initialized points, see where the distance is minimal, and assign the home to that particular segment or cluster. This exercise is done for each home; I'm just showing it for a couple of them. Based on the distance, the assignment is completed, and the colors signify what we have done: after measuring the distance of each home from the initial points, we have assigned these points to the first cluster and these blue points to the second cluster. Once this assignment is complete, the algorithm moves the centroids: it takes the center of all the points assigned to a cluster and moves the centroid from its previous position to this new one, based on the assignment which has just been completed. Then the same cluster-assignment exercise is done again: we measure the distance of each home from this centroid and from that centroid, and wherever it is minimal, we assign the home to that cluster. This process is repeated, and once the distances have been measured from the updated centroids, the assignment process starts again. So you move the centroids, measure the distances, the assignment changes as it did in the previous step, and we continue this process until we reach a point where the change in assignment has stopped completely. Once we reach the scenario where, no matter how many times you measure the distances from the centroids to the different points, your centroids do not change, the model is said to have converged, and at that point you can say, okay, this is one group of points or homes, so this is one cluster, and the second one is that cluster. So this is how K-means works. It has a wide variety
of applications, and there is a function available in the scikit-learn library; you can try implementing it. The intent of showing you this example of unsupervised learning is that we will later cover two algorithms which come from the unsupervised learning section of machine learning, namely restricted Boltzmann machines and autoencoders, which work on a similar methodology of unsupervised learning. So, in a similar fashion to what we discussed at the beginning about where the locations of these schools should be, we can use the K-means algorithm: initialize three points randomly, measure the distance to each home, assign each home to the cluster where the distance is minimal, and continue this process of measuring distances and assigning homes to clusters until these values have converged.
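A minimal sketch of what that would look like with scikit-learn's KMeans, using made-up home coordinates:

```python
import numpy as np
from sklearn.cluster import KMeans

# made-up (x, y) coordinates of homes
homes = np.array([[1, 2], [1.5, 1.8], [5, 8], [8, 8], [1, 0.6], [9, 11]])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(homes)
print(kmeans.cluster_centers_)   # candidate school locations
print(kmeans.labels_)            # which cluster each home was assigned to
```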
The most important task for any data scientist is not to remember which library is required or what the code is. In my understanding, the most important thing a data scientist should remember is that once you've been given a business problem, you should first be able to understand what kind of problem it is: whether it's a supervised learning problem or an unsupervised one, and, given it's a supervised learning problem, whether it falls into the regression type or the classification type. If you can make these decisions, then for implementing the algorithm you will find a lot of help; in fact scikit-learn has starter code for almost every algorithm, so you don't have to remember lines of code and algorithms. All you should be able to do, once the problem has been given to you, is identify what kind of problem it is. Most of the time in unsupervised learning, and specifically with K-means kind of models, we use the elbow method as an indicator to help you understand what number of clusters we should probably start with for the final implementation of your model. So let me
give you an intuition of how it works. SSE stands for sum of squared errors. To see what it means, let me go back a little: suppose you have identified these two clusters. The sum of squared errors means that you take the centroid and measure the distance to the points associated with that cluster. So you measure the distance for each point in the orange group, square and sum all those distances, do the same exercise for the blue points, and whatever total comes out of this exercise is your total sum of squared errors. With two clusters you will have some number; just for intuition, let's say this total sum of squares comes out as 100. That number is only for illustration. Now let's say somebody identified one more cluster here, and these three points, though they are blue in colour, now belong to this new segment, while the rest of the points remain assigned as before. As we saw, with two clusters our sum of squares was 100; with three, you can see that these points were a bit far from their old cluster, so the distance they contribute would be a bit less, given that there is now a centroid closer to them. Let's say the total goes down to 95; I'm just making up numbers. What that tells me is that the sum of squares is going down, and I'm probably finding clusters which are closer to the actual data points. As you would expect, if I keep increasing the number of clusters in the population, this total should keep going down, and it can go all the way to zero when every point becomes a cluster in itself: if I have 20 data points and I say every point is its own cluster, then the distance of each point from its centroid is zero, and the overall SSE becomes zero. So it may start from a very high number, but it will keep reducing with each cluster you add. This line is what is sometimes called the elbow method, and here is what it is actually showing you: if you had only one cluster anywhere in the population and you computed the sum of squared distances, this was the value; with two clusters, this was the value; with three, this; with four, this; but with five, the sum of squared errors did not reduce much. After that, even though you keep adding clusters, the sum of squared errors is not really going down. As I was saying, this is an indicative method: if I have done my cluster analysis with different numbers of clusters, I am measuring the sum of squared errors for each number of clusters, and I see that after four the sum of squared errors is not going down, it gives me an indication that I have probably found clusters which are more or less coherent and the population is not very far from the centroids. From that point you can make the assumption that four clusters is probably a good choice for my given population. But, as I also mentioned, it is just an indicative process and a good starting point: you need to see what the distribution of your clusters looks like, whether they solve the business problem you are trying to solve, and, if not, whether you need to further divide the clusters which your initial model has identified.
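A minimal sketch of how you might draw this elbow curve with scikit-learn, reusing the made-up home coordinates from the K-means sketch above (inertia_ is scikit-learn's name for the within-cluster sum of squared distances):

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

sse = []
ks = range(1, 6)
for k in ks:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(homes)
    sse.append(km.inertia_)   # sum of squared distances to the nearest centroid

plt.plot(ks, sse, marker='o')
plt.xlabel('number of clusters')
plt.ylabel('sum of squared errors')
plt.show()   # look for the "elbow" where the curve flattens out
```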
Now let's take a very quick introduction to the third type of learning, which is called reinforcement learning. We have already seen the two other learning types, supervised learning and unsupervised learning: in the first one we are trying to predict some dependent variable, and in the second one we are trying to identify some kind of structure in the data set, or, to put it in other words, some kind of coherent groups in the population. The third one is
reinforcement learning and it's
basically that an object or a system
learns from the environment and there is
no right or wrong answer given to the
system explicitly or in the beginning
itself like in the case of supervised
learning. Here the agent keeps moving around in its environment. Think of self-flying helicopters: they fly on their own, sense the wind speed and the pressure around them, correct their behavior accordingly, and the objective they need to achieve is to stay in the air for a longer period of time. Here is another example. Let's say we have a robotic dog and somebody needs to train it to take correct decisions, where correct means it walks on the path where it is supposed to walk, it does not go off the path, and if some task is given, it carries it out properly. There are two components of reinforcement learning, called reward and penalty. If the agent does the correct thing, it receives some reward; obviously we will express everything in mathematical numbers. If it does the wrong thing, it receives a penalty, and on the basis of these signals it keeps refining its decisions. So for the dog, if it is walking correctly it receives points; if a ball is thrown and the robotic dog goes and picks it up, that is a reward; if it does the wrong thing, it receives a penalty. Almost all reinforcement learning agents work in a similar fashion. For those of you interested in implementing this, there is an algorithm called deep Q-learning, where you can design your own environment and assign what the rewards and penalties are. A toy reward-and-penalty sketch follows below.
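Here is a minimal, made-up sketch of the reward-and-penalty loop using tabular Q-learning rather than the full deep Q algorithm. The states, actions and reward values are entirely hypothetical; the point is only to show how rewards and penalties drive the agent's decisions.

```python
# A made-up sketch of the reward-and-penalty loop (tabular Q-learning,
# not deep Q). States, actions and rewards here are hypothetical.
import random

n_states, n_actions = 5, 2             # tiny toy environment
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, epsilon = 0.1, 0.9, 0.2  # learning rate, discount, exploration

def step(state, action):
    """Hypothetical environment: action 1 (move right) is rewarded, action 0 is penalised."""
    next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if action == 1 else -1.0
    return next_state, reward

state = 0
for _ in range(1000):
    if random.random() < epsilon:
        action = random.randrange(n_actions)                       # explore
    else:
        action = max(range(n_actions), key=lambda a: Q[state][a])  # exploit what was learned
    next_state, reward = step(state, action)
    # Q-learning update: nudge the value of (state, action) towards reward + discounted future value
    Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
    state = next_state

print(Q)  # action 1 ends up with higher values: the agent has learned from rewards and penalties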
In a similar fashion, a reinforcement learning agent keeps interacting with its environment, as I mentioned. A self-driving car is also
one of the examples which would be
receiving rewards and penalties based on
whether it's running on the track,
taking the right turns and moving at the
correct speed, maintaining distance from
other cars on the road. So reinforcement learning has a huge role in self-driving cars, or at least in some components of them; other components of a self-driving car, such as recognising what objects are in front of it, are supervised learning. So what are the real limitations of machine learning? Given that we already have all three types of algorithms available, supervised, unsupervised and reinforcement learning, why do we want a new architecture or a new type of algorithm for our artificial intelligence systems?
First and foremost is the form of the data. When I say form, I mean the type of data we get from a lot of sources. Let's say we receive images, which are grid-like: where are the pixels, and what is the intensity of each pixel in the image? Or take natural language processing: language data comes in different lengths, and the task itself is different. Suppose you need to design a machine learning algorithm which can do language translation. If you conceptualize language translation from a machine learning perspective, your input becomes a sequence of words and your output is also a sequence of words. Some of you who are working with machine learning algorithms, try to think whether any algorithm currently available, like logistic regression or a decision tree, could even fit this kind of problem, leaving aside how good the accuracy would be. These problems come from a different type of data source, and we are trying to solve a different kind of task, like language translation or a chatbot-style problem where you give a sequence and it returns a sequence. That kind of architecture is simply not available in classical machine learning. So that is one reason we need algorithms which can deal with data sets like images and language, and which can fit models that do more than predict or classify, models that can also output values like the sequences in the example I gave. That is the first reason we are looking for a different type of architecture for solving such problems. The second
problem, which machine learning algorithms are not very good at dealing with, is dimensionality. You may have seen data sets with, say, a thousand variables: with 100,000 rows and a thousand columns you can probably still fit some machine learning algorithms on top of it. But look at the kind of problems we are dealing with, like images. Say every image is 200 x 200, meaning 200 pixels by 200 pixels, and it's a colored image, which means there are three channels. Every image is essentially a matrix, and a colored image is represented in the system through three such grids, one each for red, green and blue, stacked on top of each other. So the number of values you need to represent your image in the system, or in your algorithm, is 200 x 200 x 3, and if my math is correct, that comes to 120,000 features. So even a simple image of such small dimensions gives you 120,000 features; the quick check below confirms the arithmetic.
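Just to double-check that arithmetic, here is a one-liner; NumPy is used only for illustration.

```python
# Quick arithmetic check: a 200 x 200 RGB image flattened into a plain
# feature vector really does give 120,000 numbers.
import numpy as np

image = np.zeros((200, 200, 3))   # hypothetical colored image: height x width x channels
print(image.size)                 # 200 * 200 * 3 = 120000 features if flattened
```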
Plus, if you're really working on a complex problem, say object identification in images, there may be five or six objects you need to identify and you may be dealing with, say, 100,000 images. Then the scale of the data becomes too huge for any classical machine learning algorithm to handle easily, and these algorithms fail to produce any interesting results. Coming to the solution: we need an architecture which can not only read such data, like images, but is also capable of dealing with such huge data dimensions. That is the second benefit which comes from deep learning algorithms, and we'll discuss how they manage such high-dimensional data when we talk about the different architectures. The third and most
important reason that we will be looking
for a different kind of model structure
or different kind of algorithm is for
identifying the features. In machine learning, we as data scientists spend a lot of time curating the important features: first you scale the features, then you create interaction variables, and if the separating boundary is still not clear, you introduce higher-dimensional transformations. Say these are your data points, and you see that the line you are fitting is not separating them clearly; then some of you would try higher-order polynomials of your input features. All of this is not only difficult to come up with, there is also a lot of trial and error over which kind of transformation and which kind of derived variable will really work for the classification problem. That's the first issue. The second is that if you're working with higher-order polynomials, what is the correct order of polynomial to create? Just to give you a sense of the scale, say you are dealing with only 100 features and you create second-order polynomial terms with interactions of all these 100 features: you end up with around 5,000 features from the second-order polynomial alone. If you go to third-order polynomials, cube terms and interactions of three variables together, those 100 variables come to around 170,000 features; the sketch below shows how quickly this blows up.
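To see that explosion in practice, here is a small sketch using scikit-learn's PolynomialFeatures on 100 made-up input features. The exact counts depend on which terms you include, but they are the same order of magnitude as the numbers mentioned above.

```python
# A small sketch of how quickly polynomial feature creation blows up,
# using scikit-learn's PolynomialFeatures on 100 made-up features.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.random.rand(5, 100)                      # 5 rows, 100 original features

for degree in (2, 3):
    poly = PolynomialFeatures(degree=degree, include_bias=False)
    n_out = poly.fit_transform(X).shape[1]
    print(degree, n_out)                        # degree 2 -> 5150 columns, degree 3 -> 176850 columns
```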
So this creation of features is very, very difficult if we go and start creating them on our own. Say our objective is to identify a television in an image, and we have some pictures where we need to find it. Those of you who have been working in computer science for quite some time would know that earlier we used to hand-craft features such as SIFT, SURF and HOG features, but these are essentially static descriptors for a given object. The television may be here in this picture but somewhere else in another picture, so the feature I'm identifying has to be spatially invariant: it should work wherever the object appears in the image. The same goes for language: if you're dealing with language data, the model should not only understand the meaning of a word and how the word fits into the sentence, but also the context of each word. Word embeddings learned by neural networks help with exactly that: they capture which words are related to a given word, and from that you make predictions. So the broad problems with machine learning algorithms are: first, they cannot easily deal with different types of data like images and natural language; second, the dimensionality problem, when the data runs into 100,000 features or more; and third, this manual feature creation. So
these are the three basic reasons that
one of you or all of you would be
interested in going to one of the deep
learning architectures for solving such
problems. And fourthly, if I may add it, deep learning architectures, given the amount of computational power we put into them, end up giving you better accuracy for both classification and regression problems. That's the fourth benefit. And how does it really work? There are different stages in a deep learning model, and they are called deep because it's not just input and output like we have seen in regression, where y is some linear function of x. Here we have many intermediate layers doing complex calculations, so that all those features I mentioned, for example the ones you need to identify a television, get computed at different stages one after the other, and at the final stage you have very refined features. This is not only for image classification: for any problem you are trying to solve, through these multiple stages your model is able to learn intelligent features which are really important for your classification, regression, or whatever problem you are trying to solve. So these were the main benefits of deep learning, and these are the broad reasons someone would be interested in learning deep learning algorithms.
>> So this is the problem statement guys.
We need to figure out if the bank notes
are real or fake and for that we'll be
using artificial neural networks and
obviously we need some sort of data in
order to train our network. So let us
see how the data set looks like. So over
here I've taken a screenshot of the data
set with a few of the rows in it. The data were extracted from images taken of genuine and forged banknote-like specimens. After that, wavelet transform tools were used to extract features from those images, and these are the few features I'm highlighting with my cursor. The final column, the last one, represents the label. The label tells us which class a pattern belongs to, whether it represents a fake note or a real note. Let us discuss these features and labels one by one. The first feature, or the first column, is the variance of the wavelet-transformed image. The second column is its skewness. The third is the kurtosis of the wavelet-transformed image, and the fourth one is the entropy of the image. As for the label, which is the last column over here: if the value is one, the pattern represents a real note, whereas if the value is zero, it represents a fake note. So guys, let's
move forward and we'll see what are the
various steps involved in order to
implement this use case. So over here
we'll first begin by reading the data
set that we have. We'll define features
and labels. After that we are going to
encode the dependent variable. And what
is a dependent variable? It is nothing
but your label. Then we are going to
divide the data set into two parts. One
for training, another for testing. After
that we'll use TensorFlow data
structures for holding features, labels,
etc. And TensorFlow is nothing but a
Python library that is used in order to
implement deep learning models or you
can say neural networks. Then we'll
write the code in order to implement the
model. And once this is done, we will
train our model on the training data.
We'll calculate the error. The error is
nothing but your difference between the
model output and the actual output and
we'll try to reduce this error and once
this error becomes minimum we'll make
prediction on the test data and we'll
calculate the final accuracy. So guys
let me quickly open my PyCharm and I'll
show you how the output looks like. So
this is my PyCharm guys over here I've
already written the code in order to
execute the use case. I'll go ahead and
run this and I'll show you the output.
So over here as you can see with every
iteration the accuracy is increasing. So
let me just stop it right here. All
right. Any questions or doubts so far with respect to what our use case is or what the data set is about? So we'll
move forward and we'll understand why we
need neural networks. So in order to
understand why we need neural networks,
we are going to compare the approach
before and after neural networks and
we'll see what were the various problems
that were there before neural networks.
So earlier, conventional computers used an algorithmic approach: the computer follows a set of instructions
in order to solve a problem and unless
the specific steps that the computer
needs to follow are known the computer
cannot solve the problem. So obviously
we need a person who actually knows how
to solve that problem and he or she can
provide the instructions to the computer
as to how to solve that particular
problem. Right? So we first should know
the answer to that problem or we should
know how to overcome that challenge or
problem which is there in front of us.
Then only we can provide instructions to
the computer. So this restricts the
problem solving capability of
conventional computers to problems that
we already understand and know how to
solve. But what about those problems
whose answer we have no clue of. So
that's where our traditional approach
was a failure. So that's why neural
networks were introduced. Now let us see
what was the scenario after neural
networks. So neural networks basically
process information in a similar way the
human brain does and these networks they
actually learn from examples. You cannot
program them to perform a specific task.
They will learn from their examples from
their experience. So you don't need to
provide all the instructions to perform
a specific task and your network will
learn on its own with its own
experience. All right. So this is what
basically neural network does. So even
if you don't know how to solve a
problem, you can train your network in
such a way that with experience it can
actually learn how to solve the problem.
So that was a major reason why neural
networks came into existence. So these
neural networks are basically inspired
by neurons which are nothing but your
brain cells and the exact working of the
human brain is still a mystery though.
As I've told you earlier, neural networks work like the human brain, hence the name. And just as a newborn human baby learns from his or her experience, we want a network to do that as well, but we want it to do it very quickly. So here's a diagram of a
neuron. Basically a biological neuron
receives input from other sources
combines them in some way perform a
generally nonlinear operation on the
result and then outputs the final
result. So here if you notice these
dendrites these dendrites will receive
signals from the other neurons. Then
what will happen? It will transfer it to
the cell body. The cell body will
perform some function. It can be
summation, it can be multiplication. After performing that function on the set of inputs, the result is transferred via the axon to the next neuron. Now
let's understand what exactly are
artificial neural networks. It is
basically a computing system that is
designed to simulate the way the human
brain analyzes and processes information. An artificial neural network has self-learning capabilities that enable
it to produce better results as more
data becomes available. So if you train
your network on more data, it'll be more
accurate. So these neural networks, they
actually learn by example. And you can
configure your neural network for
specific applications. It can be pattern
recognition or it can be data
classification, anything like that. All
right. So because of neural networks we
see a lot of new technology has evolved
from translating web pages to other
languages to having a virtual assistant
to order groceries online to conversing
with chat bots. All of these things are
possible because of neural networks. So
in a nutshell if I need to tell you
artificial neural network is nothing but
a network of various artificial neurons.
All right. So let me show you the
importance of neural network with two
scenarios before and after neural
network. So over here we have a machine
and we have trained this machine on four
types of dogs as you can see where I'm
highlighting with my cursor and once the
training is done we provide a random
image to this particular machine which
has a dog but this dog is not like the
other dogs on which we have trained our
system on. So without neural networks
our machine cannot identify that dog in
the picture as you can see it over here.
Basically our machine will be confused.
It cannot figure out where the dog is.
Now when I talk about neural networks,
even if we have not trained our machine
on this specific dog, but still it can
identify certain features of the dogs
that we have trained on and it can match
those features with the dog that is
there in this particular image and it
can identify that dog. So this happens
all because of neural networks. So this
is just an example to show you how
important are neural networks. Now I
know you all must be thinking how neural
networks work. So for that we'll move
forward and understand how it actually
works. So over here I'll begin by first
explaining a single artificial neuron, which is called a perceptron. So this is an example of a perceptron. Over here we have multiple inputs x1, x2, and so on up to xn, and we have corresponding weights as well: w1 for x1, w2 for x2, and similarly wn for xn. Then what happens? We calculate the weighted sum of these inputs, and after doing that we pass it through an activation function. The activation function essentially provides a threshold value: above that value my neuron will fire, else it won't fire. So this is basically an artificial neuron, and a minimal sketch of it in code is shown below.
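Here is a minimal sketch of that single artificial neuron: a weighted sum of the inputs passed through a step activation. The weights and threshold are made-up values just for illustration.

```python
# A minimal sketch of a single perceptron: weighted sum of inputs passed
# through a step activation. Weights and threshold are made-up values.
def perceptron(inputs, weights, threshold):
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum > threshold else 0   # step activation: fire or don't fire

# Example: three inputs with made-up weights and a threshold of 1.0
print(perceptron([1, 0, 1], [0.5, 0.3, 0.9], threshold=1.0))  # 1.4 > 1.0, so the neuron fires -> 1
```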
So when I talk about a neural network it
involves a lot of these artificial
neurons with their own activation
function and their processing element.
Now we'll move forward and we'll
actually understand various modes of
this perceptron or single artificial
neuron. So there are two modes in a
perceptron. One is training, another is
using mode. In training mode, the neuron
can be trained to fire for particular
input patterns. Which means that we'll
actually train our neuron to fire on
certain set of inputs and to not fire on
the other set of inputs. That's what
basically training mode is. When I talk about using mode, it means that when a taught input pattern is detected at the input, its associated output becomes the current output. In other words, once the training is done and we provide an input the neuron has been trained on, it will detect that input and provide the associated output. That's what using mode basically is. So
first you need to train it then only you
can use your perceptron or your uh
network. So these were the two modes
guys. Next up we'll understand what are
the various activation functions
available. There are many more, but I've listed down three. First is the step function: the moment your input is greater than a particular value, your neuron will fire, else it won't. It works similarly for the sigmoid and sign functions as well. So these are the three most commonly used activation functions. Next up, what
we are going to do we are going to
understand how a neuron learns from its
experience. So I'll give you a very good
analogy in order to understand that and
later on, when we talk about neural networks, or you can say multiple neurons in a network, I'll explain the maths behind learning and how it actually happens.
So right now I'll explain you with an
analogy and guys trust me that analogy
is pretty interesting. So I know all of
you must have guessed it. So these are
two beer mugs and all of you who love
beer can actually relate to this analogy
a lot and I know most of you actually
love beer. So that's why I've chosen
this particular analogy so that all of
you can relate to it. All right, jokes
apart. So fine guys, so there's a beer
festival happening near your house and
you want to badly go there. But your
decision actually depends on three
factors. First is how is the weather,
whether it is good or bad. Second is
your wife or husband is going with you
or not. And the third one is any public
transport is available. So on these
three factors, your decision will depend
whether you'll go or not. So we'll
consider these three factors as inputs
to our perceptron and we'll consider our
decision of going or not going to the
beer festival as our output. So let us
move forward with that. So we'll move
forward and we'll see what are the
various inputs that I'm talking about.
So the first input is how is the
weather? We'll consider it as x1. So
when weather is good it'll be one and
when it is bad it'll be zero. Similarly
your wife is going with you or not. So
that be your x2. If she is going then
it's one. If she's not going then it's
zero. Similarly for public transport if
it is available then it is one else it
is zero. So these are the three inputs
that I'm talking about. Let's see the
output. So output will be one when
you're going to the beer festival and
output will be zero when you want to
relax at home. You want to have beer at
home only. You don't want to go outside.
So these are the two outputs whether you
are going or you're not going. Now what
a human brain does over here. Okay fine
I need to go to the beer festival but
there are three things that I need to
consider. But will I give importance to
all these factors equally? Definitely
not. There'll be certain factors which
will be of higher priority for me. I'll
focus on those factors more. Whereas few
factors won't affect that much to me.
All right. So let's prioritize our
inputs or factors. So here our most
important factor is weather. So if
weather is good, I love beer so much
that I don't care even if my wife is
going with me or not or if there is a
public transport available. So I love
beer that much that if weather is good
that definitely I'm going there. That
means when x1 is high output will be
definitely high. So how we do that? How
we actually prioritize our factors or
how we actually give importance more to
a particular input and less to another
input in a perceptron or in a neuron. So
we do that by using weights. So we
assign high weights to the more
important factors or more important
inputs and we assign low weights to
those particular inputs which are not
that important for us. So let's assign weights, guys. Weight w1 is associated with input x1, w2 with x2 and similarly w3 with x3. Now, as I've told you earlier, weather is a very important factor, so I'll assign a pretty high weight to weather and keep it as six. W2 and w3 are not that important, so I'll keep them as two each. After that, I've defined a threshold value of five, which means that only when the weighted sum of my inputs is greater than five will my neuron fire, or you can say only then will I be going to the beer festival. All right. So I'll use my pen and
we'll see what happens when the weather is good. When the weather is good, x1 is 1 and its weight is six, so we multiply 1 by 6. Now say my wife decides she is going to stay at home, she'll be busy cooking and doesn't want to drink beer with me, so she's not coming; that input becomes zero, and 0 times 2 makes no difference. Then again, there is no public transport available either, so that is also 0 times 2. So what output do I get? I get six. And notice the threshold value: it is five, and six is definitely greater than five. That means my output will be one, or you can say my neuron will fire, or I'll actually go to the beer festival. So even if these two inputs are zero, meaning my wife is not willing to go with me and there is no public transport available, the weather is good, which has a very high weight value and matters a lot to me. If that input is high, it doesn't really matter whether the other two are high or not, I will definitely go to the beer festival. All right, now I'll explain a different scenario. Over here our threshold was five, but what if I change this threshold to three? In that scenario, even if my weather is not good, I'll give it a zero, so 0 times 6, but my wife and public transport are both available: 1 times 2 plus 1 times 2 equals 4, which is definitely greater than three. Then also my output will be one; that means I will definitely go to the beer festival even if the weather is bad, and my neuron will fire. So these are the two scenarios I have discussed with you; a quick code sketch of both follows.
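Here is a small sketch of the beer-festival perceptron from the analogy, with the same made-up weights of 6, 2 and 2 and the two thresholds we just walked through.

```python
# A small sketch of the beer-festival perceptron from the analogy above.
# Inputs: weather, wife going, public transport (1 = yes, 0 = no);
# the weights 6, 2, 2 and the thresholds are the made-up numbers from the example.
def go_to_festival(weather, wife, transport, threshold=5):
    weighted_sum = 6 * weather + 2 * wife + 2 * transport
    return 1 if weighted_sum > threshold else 0

print(go_to_festival(1, 0, 0))               # good weather only: 6 > 5, output 1 (go)
print(go_to_festival(0, 1, 1, threshold=3))  # bad weather but 2 + 2 = 4 > 3, output 1 (go)
```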
All right. So there can be many
other ways in which you can actually
assign weights to your problem or to
your learning algorithm. So these are
the two ways in which you can assign
weights and prioritize your inputs or
factors on which your output will
depend. So obviously in real life all
the inputs or all the factors are not as
important for you. So you actually
prioritize them and how you do that in
perceptron you provide high weight to
it. This is just an analogy so that you can relate a perceptron to real life. We'll actually discuss the maths
behind it later in the session as to how
a network or a neuron learns. All right.
So how the weights are actually updated
and how the output is changing that all
those things we'll be discussing later
in the session. But my aim is to make
you understand that you can actually
relate to a real life problem with that
of a perceptron. All right? And in real
life problems are not that easy. They
are very very complex problems that we
actually face. So in order to solve
those problems, a single neuron is
definitely not enough. So we need
networks of neuron and that's where
artificial neural network or you can say
multi-layer perceptron comes into the
picture. Now let us discuss that
multi-layer perceptron or artificial
neural network. So this is how an
artificial neural network actually looks
like. So over here we have multiple
neurons present in different layers.
The first layer is always your input
layer. This is where you actually feed
in all of your inputs. Then we have the
first hidden layer. Then we have second
hidden layer and then we have the output
layer. The number of hidden layers depends on your application, on what you are working on and what your problem is; that actually determines how many hidden layers you'll have. So let me explain what is actually happening here. You provide some
input to the first layer which is
nothing but your input layer. You
provide inputs to these neurons. All
right? And after some function the
output of these neurons will become the
input to the next layer which is nothing
but your hidden layer one. Then these
hidden layers also have various neurons.
These neurons will have different
activation functions. So they'll perform
their own function on the inputs that it
receives from the previous layer. And
then the output of this layer will be
the input to the next hidden layer which
is hidden layer 2. Similarly, the output
of this hidden layer will be the input
to the output layer and finally we get
the output. So this is how basically an
artificial neural network looks like.
Now let me explain you this with an
example. So over here I'll take an
example of image recognition using
neural networks. So over here what
happens? We feed in a lot of images to
our input layer. Now this input layer
will actually detect the patterns of
local contrast and then we'll feed that
to the next layer which is hidden layer
one. So in this hidden layer one the
face features will be recognized we'll
recognize eyes nose ears things like
that and then that will be again fed as
input to the next hidden layer and in
this hidden layer we'll assemble those
features and we'll try to make a face
and then we'll get the output that is
the face will be recognized properly. So
if you notice here with every layer we
are trying to get a more abstract
version or the generalized version of
the input. So this is basically how an artificial neural network works.
All right. And there's a lot of training and learning involved, which I'll show you now: training a neural network. So how do we actually train our neural network? The most common algorithm for training a network is called backpropagation. What happens in backpropagation is this: after taking the weighted sum of the inputs, passing it through an activation function and getting the output, we compare that output to the actual output that we already know. We figure out how big the difference is, we calculate the error, and based on that error we propagate backwards and see what happens when we change the weights: does the error decrease or increase, and does it increase when we increase the values of the variables or when we decrease them? We calculate all of that and update our variables in such a way that our error becomes minimum, and
guys it takes a lot of iterations. We
get output a lot of times and then we
compare it with the model with the
actual output. Then again we propagate
backwards. We change the variables and
again we calculate the output. We
compare it again with the desired output
of the actual output. Then again we
propagate backwards. So this process
keeps on repeating until we get the
minimum value. All right. So there's an
example that is there in front of your
screen. Don't be scared of the terms
that I used. I'll actually explain you
with an example. So this is the example
over here. We have 0, 1 and 2 as inputs, and our desired output, the output that we already know, is 0, 2 and 4. So over here we can figure out that the desired output is nothing but twice the input. But
I'm training a computer to do that.
Right? The computer is not a human. So
what happens? I actually initialize my
weight. I keep the value as three. So
the model output will be 3 into 0 is 0.
3 into 1 is 3. 3 into 2 is 6. Now
obviously it is not equal to your
desired output. So we check the error.
Now the error that we have got here is 0
1 and 2 which is nothing but your
difference. So 0 - 0 is 0 3 - 2 is 1 6 -
4 is 2. Now this is called an absolute
error. After squaring this error we get
square error which is nothing but 0 1
and 4. All right. So now what we need to
do we need to update the variables. We
have seen that the output that we got is
actually different from the desired
output. So we need to update the value
of the weight. So instead of three our
computer makes it as four. After making
the value as four, we get the model
output as 0 4 and 8. And then we saw
that the error has actually increased.
Instead of decreasing, the error has
increased. So after updating the
variable, the error has increased. So
you can see that square error is now 0 4
and 16. And earlier it was 0 1 and 4.
That means we cannot increase the weight
value right now. But if we decrease it and make it two, we get an output which is exactly equal to the desired output. But is
it always the case that we need to only
decrease the weight? Definitely not. So
in this particular scenario, whenever
I'm increasing the weight, error is
increasing and when I'm decreasing the
weight, error is decreasing. But as I've
told you earlier as well, this is not
the case every time. Sometimes you need
to increase the weight as well. So how
we determine that? All right. Fine guys,
this is how basically a computer decide
whether it has to increase the weight or
decrease the weight. So what happens
here? This is a graph of square error
versus weight. So over here what
happens? Suppose your square error is
somewhere here and your computer it
starts increasing the weight in order to
reduce the square error and it notices
that whenever it increases the weight
square error is actually decreasing. So
it'll keep on increasing until the
square error reaches a minimum value and
after that when it tries to still
increase the weight the square error
will increase. So at that time our
network will recognize that whenever it
is increasing the weight after this
point error is increasing. So therefore
it will stop right there and that will
be our weight value. Similarly there can
be one more scenario. Suppose if we
increase the weight but then also the
square error is increasing. So at that
time we cannot increase the weight. At
that time computer will realize okay
fine whenever I'm increasing the weight
the square error is increasing. So it'll
go in the opposite direction. So it'll
start decreasing the weight and it'll
keep on doing that until the square
error becomes minimum. And the moment it
decreases more the square error again
increases. So our network will know that
whenever it decreases the weight value
the square error is increasing. So that
point will be our final weight value. So
guys this is what basically back
propagation in a nutshell is. If you
have any questions or doubts you can go
ahead and ask me. All right fine we have
no doubts here. Fine. So we'll move
forward and now is the correct time to
understand how to implement the use case
that I was talking about at the
beginning. That is how to determine
whether a node is fake or real. So for
that I'll open my PyCharm. This is my
PyCharm again guys. Uh let me just close
this. All right. So this is the code
that I've written in order to implement
the use case. So over here what we do we
import the important libraries which are required first. Matplotlib is used for visualization, TensorFlow, as we know, to implement the neural network, NumPy for arrays, pandas for reading the data set, and similarly sklearn for label encoding as well as for shuffling and for splitting the data set into training and testing parts. All right, fine guys.
So we'll begin by first reading the data
set as I've told you earlier as well
when I was explaining the steps. So what
I'll do I'll use pandas in order to read
the CSV file which has the data set.
After that I'll define features and
labels. So x will be my feature and y
will contain my label. Basically, x includes all the columns apart from the last column, which is the fifth one. Because indexing starts from zero, we have written zero till four, so the slice will not include the column at index four. And so our last column will actually be our label. Then
what we need to do, we need to encode
the dependent variable. So the dependent
variable as I've told you earlier as
well is nothing but your label. So I've
discussed encoding in TensorFlow
tutorial. You can go through it and you
can actually get to know why and how we
do that. Then what we have done, we have
read the data set. Then what we need to
do is to split our data set into
training and testing. And these are all
optional steps. You can print the shape
of your training and test data. If you
don't want to do it, it's still fine.
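Just to make those data-preparation steps concrete, here is a hedged sketch of what they might look like. The file name, column layout and helper names are assumptions for illustration only, not the exact code in the video.

```python
# A hedged sketch of the data-preparation steps described above. The file name,
# column layout and variable names are assumptions for illustration only.
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.utils import shuffle

df = pd.read_csv("banknote.csv", header=None)    # hypothetical CSV of the banknote data
X = df.iloc[:, 0:4].values                       # four wavelet features
y = df.iloc[:, 4].values                         # last column is the label (0/1)

# encode the dependent variable (the label) as one-hot vectors for the network
y_encoded = pd.get_dummies(LabelEncoder().fit_transform(y)).values

X, y_encoded = shuffle(X, y_encoded, random_state=1)
train_x, test_x, train_y, test_y = train_test_split(X, y_encoded, test_size=0.2, random_state=415)
print(train_x.shape, train_y.shape, test_x.shape, test_y.shape)
```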
Then we have defined learning rate. So
learning rate is actually the steps in
which the weights will be updated. All
right. So that is what basically
learning rate is. Then, when we talk about epochs, those are simply iterations. Then we have defined cost_history, which will be an empty NumPy array, with shape one, holding float-type values. Then we have defined n_dim, which is nothing but the shape of x along axis one, that is, the number of columns, and we'll print that. After that we have defined the number of classes: there can be only two classes, a note can be fake or real. And this model
path I've given in order to save my
model. So I've just given a path where I
need to save it. So I'll just save it
here only in the current working
directory. Now is the time to actually
define our neural network. So we'll
first make sure that we have defined the
important parameters like hidden layers,
number of neurons in hidden layers. So
I'll take 10 neurons in every hidden
layer and I'm taking four layers like
that. Then x will be my placeholder, and the shape of this placeholder is (None, n_dim); the n_dim value I'll get from here, and None can be any value. I'll define one variable W, initialize it with zeros, and this will be the shape of my weights. Similarly for the bias as well, this will be its shape. And there will be one more placeholder, y_dash, which will be used to provide the actual output of the model.
There'll be one model output and
there'll be one actual output which we
use in order to calculate the
difference. Right? So we'll feed in the
actual values of the labels in this
particular placeholder ydash. And now
we'll define the model. Over here we have named the function multi-layer perceptron, and in it we'll first define
the first layer. So the first hidden
layer and we are going to name it as
layer_1 which will be nothing but the
matrix multiplication of x and weights
of h1 that is the hidden layer 1 and
that'll be added to your biases b1 after
that we'll pass it through a sigmoid
activation function. Similarly in layer
2 as well matrix multiplication of layer
1 and weights of h2. So if you can
notice layer 1 was the network layer
just before the layer two right. So the
output of this layer 1 will become input
to the layer 2 and that's why we have
written layer_1 it'll be multiplied by
weight h2 and then we'll add it with the
bias. Similarly for this particular hidden layer as well, and for this layer as well, but over here we are going to use the ReLU activation function instead of sigmoid. A hedged sketch of what this model function might look like is shown below.
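Here is a hedged sketch, in TensorFlow 1.x style (placeholders and variables, matching the session-based code shown in the video), of a multilayer-perceptron function like the one described. The exact mix of sigmoid and ReLU layers, the layer sizes and the variable names are assumptions, not the author's exact code.

```python
# A hedged sketch in TensorFlow 1.x style (these APIs were removed in TF 2).
# Layer sizes, the sigmoid/ReLU mix and the names are assumptions.
import tensorflow as tf

n_dim, n_class = 4, 2                      # four banknote features, two classes
n_hidden_1 = n_hidden_2 = n_hidden_3 = n_hidden_4 = 10

x = tf.placeholder(tf.float32, [None, n_dim])
y_ = tf.placeholder(tf.float32, [None, n_class])

weights = {
    'h1': tf.Variable(tf.truncated_normal([n_dim, n_hidden_1])),
    'h2': tf.Variable(tf.truncated_normal([n_hidden_1, n_hidden_2])),
    'h3': tf.Variable(tf.truncated_normal([n_hidden_2, n_hidden_3])),
    'h4': tf.Variable(tf.truncated_normal([n_hidden_3, n_hidden_4])),
    'out': tf.Variable(tf.truncated_normal([n_hidden_4, n_class])),
}
biases = {
    'b1': tf.Variable(tf.truncated_normal([n_hidden_1])),
    'b2': tf.Variable(tf.truncated_normal([n_hidden_2])),
    'b3': tf.Variable(tf.truncated_normal([n_hidden_3])),
    'b4': tf.Variable(tf.truncated_normal([n_hidden_4])),
    'out': tf.Variable(tf.truncated_normal([n_class])),
}

def multilayer_perceptron(x, weights, biases):
    # each layer: multiply the previous layer's output by this layer's
    # weights, add the bias, then apply an activation function
    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, weights['h1']), biases['b1']))
    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, weights['h2']), biases['b2']))
    layer_3 = tf.nn.relu(tf.add(tf.matmul(layer_2, weights['h3']), biases['b3']))
    layer_4 = tf.nn.relu(tf.add(tf.matmul(layer_3, weights['h4']), biases['b4']))
    return tf.matmul(layer_4, weights['out']) + biases['out']   # linear output layer

y = multilayer_perceptron(x, weights, biases)
```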
Then we are going to define the weights and biases. So this is how we basically define the weights: weights
h1 will be a variable which will be a
truncated normal with the shape of n dim
and n hidden_1. So these are nothing but
your shapes. All right. And after that
what we have done we have defined biases
as well. Then we need to initialize all
the variables. So all these things
actually I've discussed in brief when I
was talking about tensorflow. So you can
go through tensorflow tutorial at any
point of time if you have any question.
We have discussed everything there.
Since in tensorflow we need to
initialize the variables before we use
it. So that's how we do it. We first
initialize it and then we need to run
it. That's when your variables will be
initialized. After that we are going to
create a saver object and then finally
I'm going to call my model and then
comes the part where the training
happens. Cost function. Cost function is
nothing but you can say an error that
will be calculated between the actual
output and the model output. All right.
So y is nothing but our model output and
ydash is nothing but actual output or
the output that we already know. All
right. And then we are going to use a
gradient descent optimizer to reduce
error. Then we are going to create a
session object as well. And finally what
we are going to do we are going to run
the session. So this is how we basically
do that. For every epoch we will be
calculating the change in the error as
well as the accuracy that comes after
every epoch on the training data. After
we have calculated the accuracy on the
training data, we going to plot it for
every epoch how the accuracy is. And
after plotting that we going to print
the final accuracy which will be on our
test data. So using the same model we'll
make prediction on the test data and
after that we are going to print the
final accuracy and the mean squared
error. A hedged sketch of this training loop is shown below. So let's go ahead and execute this, guys.
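Here is a hedged sketch of the training loop just described, continuing the TensorFlow 1.x names from the previous sketches (y, y_, x, train_x, train_y, test_x, test_y are assumptions). Mean squared error is used here as the cost and 0.3 as the learning rate; the exact cost function and hyperparameters in the video's code may differ.

```python
# A hedged sketch of the training loop, in TF 1.x style; names, the MSE cost
# and the learning rate are assumptions.
learning_rate, training_epochs = 0.3, 100

cost = tf.reduce_mean(tf.square(y - y_))          # error between model output and actual output
training_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
correct = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())   # variables must be initialized before use
    for epoch in range(training_epochs):
        sess.run(training_step, feed_dict={x: train_x, y_: train_y})
        acc = sess.run(accuracy, feed_dict={x: train_x, y_: train_y})
        print("epoch", epoch, "training accuracy", acc)
    print("test accuracy", sess.run(accuracy, feed_dict={x: test_x, y_: test_y}))
```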
All right. So training is done and this
is the graph we have got for accuracy
versus epoch. The y-axis represents accuracy, whereas this axis is epochs. We have taken 100 epochs, and our
accuracy has reached somewhere around
99%. So with every epoch it keeps increasing, apart from a couple of instances. And the more data you train your model on, the more accurate it will be.
Let me just close it. So now the model
has also been saved where I wanted it to
be. This is my final test accuracy and
this is the mean squared error. All
right. So these are the files that will
appear once you save your model. These
are the four files that I've
highlighted. Now what we need to do is
restore this particular model and I've
explained this in detail how to restore
a model that you have already saved. So
over here what I'll do I'll take some
random range. I've taken it actually
from 754 to 768. So all the values in the rows from 754 to 768 will be fed to our model, and our model will make
prediction on that. So let us go ahead
and run this.
So when I'm restoring my model, it seems
that my model is 100% accurate for the
values that I have fed in. So whatever
values that I have actually given as
input to my model, it has correctly identified its class, whether it's a fake note or a real note, because zero stands for a fake note and one stands for a real note. Okay. So the original class is whatever is there in my data set; it is zero, and the prediction my model has made is also zero, which means it is fake. So the accuracy becomes 100%.
Similarly for other values as well.
Fine guys. So this is how we basically
implement the use case that we saw in
the beginning. So in the slide you can
notice that I've listed out only two
applications although there are many
more. So neural networks in medicine.
Artificial neural networks are currently
a very hot research area in medicine and
it is believed that they will receive
extensive application to biomedical
systems in the next few years and
currently the research is mostly on
modeling parts of human body and
recognizing diseases from various scans.
For example, it can be cardiograms, CAT
scans, ultrasonic scans etc. And
currently the research is going mostly
on two major areas. First is modeling
and diagnosing the cardiovascular
system. So neural networks are used
experimentally to model the human
cardiovascular system. Diagnosis can be
achieved by building a model of the
cardiovascular system of an individual
and comparing it with the real-time
physiological measurements taken from
the patient. And trust me guys, if this
routine is carried out regularly,
potential harmful medical conditions can
be detected at an early stage and thus
make the process of combating disease
much easier. Apart from that it is
currently being used in electronic noses
as well. Electronic noses have several
potential applications in telemedicine. Now let me just give you an introduction to telemedicine: telemedicine is the practice of medicine over long distances via a communication link. So what would the electronic noses do? They would identify odors in the remote surgical environment. These identified odors would then be electronically transmitted to another site, where an odor-generation system would recreate them. Because the sense of smell can be an important sense to the surgeon, tele-smell would enhance telepresent surgery.
So these are the two ways in which you
can use it in medicine. You can use it
in business as well, guys. Business is basically a diverse field with several general areas of specialization, such as accounting or financial analysis, and almost any neural network application would fit into some business area or into financial analysis. Now there is some potential
for using neural networks for business
purposes including resource allocation
and scheduling. I have listed down two
major areas where it can be used. One is
marketing. So there is a marketing
application which has been integrated
with a neural network system. The
airline marketing tactician is a
computer system made of various
intelligent technologies including
expert systems. A feed forward neural
network is integrated with the AMT which
is nothing but airline marketing
tactician and was trained using back
propagation to assist the marketing
control of airline seat allocations. So it
has wide applications in marketing as
well. Now the second area is credit
evaluation. Now I'll give you an example
here. The HNC company has developed
several neural network applications and
one of them is a credit scoring system
which increases the profitability of the existing model by up to 27%. So these are
few applications that I'm telling you
guys neural network is actually the
future. People are talking about neural
networks everywhere, and especially after the introduction of GPUs and with the amount of data we have now, neural networks are spreading like wildfire right now.
What is the KNN algorithm? Well, K-nearest neighbors is a simple algorithm that stores all the available cases and classifies new data, or a new case, based on a similarity measure. It suggests that if you are similar to your neighbors, then you are one of them, right? For example, if an apple looks more similar to a banana, an orange or a melon than to a monkey, a rat or a cat, then most likely the apple belongs to the group of fruits. In general, KNN is used in search applications where you're looking for similar items, that is, when your task is some form of "find items similar to this one"; you call this search a KNN search. But what is this K in KNN? The K denotes the number of nearest neighbors which vote on the class of the new data, or the testing data. For example, if K equals 1, then the testing data is given the same label as the closest example in the training set. Similarly, if K equals 3, the labels of the three closest points are checked and the most common label is assigned to the testing data. So this is what the K in the KNN algorithm means. Moving on ahead, let's see some example scenarios where KNN is used in industry, starting with recommender systems. Well, the biggest use case of KNN search is a recommender system. A recommender system is like an automated form of a
shop counter guy: when you ask him for a product, he not only shows you that product but also suggests or displays a relevant set of products related to the item you're already interested in buying. The KNN approach applies to recommending products, as on Amazon, to recommending media, as in the case of Netflix, or even to choosing which advertisement to display to a user. If
I'm not wrong, almost all of you must
have used Amazon for shopping. Right? So
just to tell you more than 35% of
Amazon.com's revenue is generated by its
recommendation engine. So what's their
strategy? Amazon uses recommendation as
a targeted marketing tool in both the
email campaigns and on most of its
website pages. Amazon will recommend
many products from different categories
based on what you are browsing and it
will pull those products in front of you
which you are likely to buy like the
frequently bought together option that
comes at the bottom of the product page
to tempt you into buying the combo.
Well, this recommendation has just one main goal: increase average order value, that is, upsell and cross-sell customers by providing product suggestions based on items in the shopping cart or on the product they are currently looking at on the site. The next industrial application of the KNN algorithm is concept search, that is, searching
semantically similar documents and
classifying documents containing similar
topics. So, as you know the data on the
internet is increasing exponentially
every single second. There are billions
and billions of documents on the
internet. Each document on the internet contains multiple candidate concepts. The main problem is to extract concepts from a set of documents, as each page could have thousands of word combinations that could be potential concepts; an average document could have millions of them. Combine that with the vast amount of data on the web and we are talking about an enormous number of data sets and samples. So what do we need here? We need to find concepts in that enormous amount of data, and for this purpose we can use the KNN algorithm. More advanced examples could include handwriting detection, as in OCR, or image recognition, or even video recognition.
All right. So now that you know various use cases of the KNN algorithm, let's proceed and see how it works. So how does the KNN algorithm work? Let's start by plotting these blue and orange points on
our graph. So these blue points they
belong to class A and the orange ones
they belong to class B. Now you get a
star as a new point and your task is to
predict whether this new point it
belongs to class A or it belongs to
class B. So to start the prediction the
very first thing that you have to do is
select the value of K. Just as I told
you K in KN&N algorithm refers to the
number of nearest neighbors that you
want to select. For example, in this
case K equal 3. So what does it mean? It
means that I'm selecting three points
which are the least distance to the new
point or you can say I'm selecting three
different points which are closest to
the star. Well, at this point you might ask how the least distance is calculated; we'll come to that in a moment. Once you calculate the distances, you'll get one blue and two orange points which are closest to the star. Since in this case we have a majority of orange points, you can say that for K = 3 the star belongs to class B, or that the star is more similar to the orange points. Moving on, what if K equals 6? For this case you have to look for the six points which are closest to the star. After calculating the distances, we find that we have four blue points and two orange points closest to the star. As you can see, the blue points are now in the majority, so you can say that for K equals 6 the star belongs to class A, or that the star is more similar to the
blue points. So by now I guess you know how the KNN algorithm works and what the significance of K is. So how will you choose the value of K, keeping in mind that K is the most important parameter in the KNN algorithm? When you build a K-nearest-neighbor classifier, how will you choose a value of K? Well, you might have a
specific value of K in mind or you could
divide up your data and use something
like cross validation technique to test
several values of K in order to
determine which works best for your
data. For example, if n equals a thousand cases, then the optimal value of K may lie somewhere between 1 and 19. But yes, unless you try it, you cannot be sure of it; a small cross-validation sketch for picking K is shown below.
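Here is a small sketch of testing several values of K with cross-validation. The iris data is used just as a convenient stand-in, and the range of K values tried is an assumption.

```python
# A small sketch of using cross-validation to test several values of K.
# The iris data and the K range are just illustrative choices.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
for k in range(1, 20, 2):                       # odd values of K from 1 to 19
    knn = KNeighborsClassifier(n_neighbors=k)
    score = cross_val_score(knn, X, y, cv=5).mean()
    print(k, round(score, 3))                   # pick the K with the best average score
```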
So now you know how the algorithm works at a high level. Let's move on and see how things are predicted using the KNN algorithm. Remember I told you the KNN algorithm uses a least-distance measure in order to find its nearest neighbors. So let's see how these distances are calculated.
Well, there are several distance measures which can be used; to start with, we'll mainly focus on Euclidean distance and Manhattan distance in this session. So what is this Euclidean distance? The Euclidean distance is defined as the square root of the sum of squared differences between a new point X and an existing point Y. For example, here we have points P1 and P2: point P1 is (1, 1) and point P2 is (5, 4). So what is the Euclidean distance between them? You can say that the Euclidean distance is the direct distance between two points. We can calculate it as the square root of (5 - 1) squared plus (4 - 1) squared, which results in 5. Next is the Manhattan distance. The Manhattan distance between real-valued vectors is calculated using the sum of their absolute differences. In this case, the Manhattan distance between P1 and P2 is the absolute value of 5 - 1 plus the absolute value of 4 - 1, which is 4 + 3, that is 7. This slide shows the difference between the Euclidean and Manhattan distances from point A to point B: the Euclidean distance is the direct, or least possible, distance between A and B, whereas the Manhattan distance is the distance between A and B measured along the axes at right angles. A quick sketch of both measures is shown below.
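Here is a quick sketch of both distance measures for the two points from the slide, P1 = (1, 1) and P2 = (5, 4).

```python
# A quick sketch of the two distance measures for P1 = (1, 1) and P2 = (5, 4).
import math

p1, p2 = (1, 1), (5, 4)

euclidean = math.sqrt(sum((a - b) ** 2 for a, b in zip(p1, p2)))
manhattan = sum(abs(a - b) for a, b in zip(p1, p2))

print(euclidean)   # 5.0  (straight-line distance)
print(manhattan)   # 7    (distance measured along the axes)
```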
Now let's take an example and see how things are predicted using the KNN algorithm in practice. Suppose we have a
data set which consists of height,
weight and t-shirt size of some
customers. Now, when a new customer comes, we only have his height and weight as information, and our task is to predict the t-shirt size of that particular customer. For this we'll be using the KNN algorithm. The very first thing we need to do is calculate the Euclidean distance. Say we have new data with height 161 cm and weight 61 kg. The first thing we'll do is calculate the Euclidean distance, which is the square root of (161 minus 158) squared plus (61 minus 58) squared, and the square root of that is 4.24. Let's drag and drop the formula; these are the Euclidean distances for the other points. Now suppose K equals 5. What does the algorithm do? It searches for the five customers closest to the new customer, that is, the ones most similar to the new data in terms of their attributes. For K equal to 5, let's find the five smallest Euclidean distances: these are the distances we are going to use, ranked in order from first to fifth. For K equal to 5 we have four t-shirts which come under size M and one t-shirt which comes under size L, so obviously the best guess, or the best prediction, for the t-shirt size at height 161 cm and weight 61 kg is M; you can say that our new customer fits into size M. A small sketch of this majority-vote step follows.
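Here is a small sketch of that prediction done by hand: compute the distances, take the K nearest rows, and let them vote. The training rows below are made-up (height, weight, size) values chosen just to mirror the example.

```python
# A small by-hand KNN prediction for the t-shirt example: compute distances,
# take the K nearest, and take a majority vote. The training rows are made up.
import math
from collections import Counter

train = [(158, 58, 'M'), (158, 59, 'M'), (160, 60, 'M'), (163, 61, 'M'),
         (165, 61, 'L'), (168, 66, 'L')]
new = (161, 61)
k = 5

nearest = sorted(train, key=lambda row: math.dist(new, row[:2]))  # nearest first
top_k_sizes = [row[2] for row in nearest[:k]]
print(Counter(top_k_sizes).most_common(1)[0][0])   # majority vote -> 'M'
```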
Well, this was all about the theoretical session. But before we drill down to the coding part, let me just tell you why people call KNN a lazy learner. KNN for
classification is a very simple
algorithm. But that's not why they are
called lazy. KNN is a lazy learner
because it doesn't have a discriminative
function from the training data. But
what it does it memorizes the training
data. There is no learning phase of the
model and all of the work happens at the
time a prediction is requested. So as
such this is the reason why KNN is
often referred to as lazy learning
algorithm. So this was all about the
theoretical session. Now let's move on
to the coding part. So for the practical
implementation of the hands-on part,
I'll be using the iris data set. This data set consists of 150 observations. We have four features and one class label. The four features are the sepal length, the sepal width, the petal length and the petal width, whereas the class label decides which flower belongs to which category. So this was the description of the data set we are using. Now let's move on and see the step-by-step solution to perform the KNN algorithm. First we'll start by
handling the data. What we have to do?
We have to open the data set from the
CSV format and split the data set into
train and test part. Next, we'll check
the similarity where we have to
calculate the distance between two data
instances. Once we calculate the
distance, next we'll look for the
neighbor and select K neighbors which
are having the least distance from a new
point. Now once we get our neighbor,
then we'll generate a response from a
set of data instances. So this will
decide whether the new point belongs to
class A or class B. Finally, we'll
create the accuracy function and in the
end we'll tie it all together in the
main function. So let's start with our code for implementing the KNN algorithm using Python. I'll be using a Jupyter notebook with Python 3 installed on it. Now let's move on and see how the KNN algorithm can be implemented in Python. So there's my Jupyter notebook, which is a web-based interactive computing notebook environment. Let me launch it. Yeah, it's launching. So there's our Jupyter notebook, and we'll be writing our Python code in it. So the first thing that we
need to do is load our file. Our data is
in CSV format without a header line or any quotes. We can open the file with the open function and read the data lines using
the reader function in the CSV module.
So let's write a code to load our data
file. Let's execute the run button. So
once you execute the run button, you can
see the entire training data set as the
output. Next, we need to split the data
into a training data set that KNN can use to make predictions and a test data
set that we can use to evaluate the
accuracy of the model. So, we first need to convert the flower measurements that were loaded as strings into numbers that we can work with. Next, we need to split the data set randomly into train and test. A ratio of 67 to 33 for train to test is a standard ratio which is used
for this purpose. So let's define a
function as load data set that loads a
CSV with a provided file name and split
it randomly into training and test data
set using the provided split ratio. So
this is our function load data set which
is using file name split ratio, training
data set and testing data set as its
input. All right. So let's execute the
run button and check for any errors. So
it's executed with zero errors. Let's
test this function. So this is our
training set, testing set, load data
set. So this is our function load data
set and inside that we are passing our
file iris data with a split ratio of
0.66 and training data set and test data
set. Let's see how it divides the data into training and test sets. So it's giving a count of the training data set and the testing data set: the total number of training rows it has split into is 97, and the total number of test rows we have is 53. All right. Okay. So our
function load data set is performing
well. So let's move ahead to step two
which is similarity. So in order to make a prediction, we need to calculate the similarity between any two given data instances. This is needed so that we can locate the k most similar data instances in the training data set and in turn make a prediction. Given that all four flower measurements are numeric and have the same unit, we can directly use the Euclidean distance measure, which is nothing but the square root of the sum of squared differences between the two arrays of numbers. Additionally, we want to control which fields to include in the distance calculation; specifically, we only want to include the first four attributes. So our approach will be to limit the Euclidean distance to a fixed length. All right. So let's define our Euclidean function. So this is a Euclidean distance function which takes instance one, instance two and length as parameters. Instance one and instance two are the two points between which you want to calculate the Euclidean distance, whereas the length denotes how many attributes you want to include. Okay. So there's our Euclidean function. Let's execute it. It's executing fine without any errors. Let's test the function.
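As a reference, a minimal sketch of the kind of distance function described here might look like this (an illustration, not the exact code shown on screen):

```python
import math

def euclidean_distance(instance1, instance2, length):
    # sum the squared differences over the first `length` attributes only
    distance = 0.0
    for i in range(length):
        distance += (instance1[i] - instance2[i]) ** 2
    return math.sqrt(distance)
```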
Suppose the data 1 or the first instance
consists of the data point as 222 and it
belongs to class A, and data 2 consists of 444 and it belongs to class B. So when we calculate the Euclidean distance from data 1 to data 2, what we have to do is consider only the first three features of them. All right. So let's print the distance. As you can see here, the distance comes out to be 3.464. So this distance is nothing but the Euclidean distance, and it is calculated as the square root of (4 - 2) squared + (4 - 2) squared + (4 - 2) squared, which is nothing but 3 times (4 - 2) squared, that is 12, and the square root of 12 is nothing but 3.464. All right. So now that we have
calculated the distance now we need to
look for k nearest neighbors. Now that
we have a similarity measure, we can use it to collect the k most similar instances for a given unseen instance. Well, this is a straightforward process of calculating the distance for all the instances and selecting a subset with the smallest distance values. So for that, we'll be defining a function as get neighbors. What will it do? It will
return the k most similar neighbors from
the training set for a given test
instance. All right. So this is how our
get neighbors function look like. It
takes training data set and test
instance and K as its inputs. The K is nothing but the number of nearest neighbors you want to check for. All right. So basically what you'll be getting from this get neighbors function is K different points having the least Euclidean distance from the test instance. All right. Let's execute it.
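A rough sketch of such a get neighbors function, reusing the euclidean_distance sketch from above, could look like this (it assumes the last column of each training row is the class label):

```python
import operator

def get_neighbors(training_set, test_instance, k):
    # rank every training row by its distance to the test instance
    length = len(test_instance) - 1            # skip the class label column
    distances = []
    for row in training_set:
        dist = euclidean_distance(test_instance, row, length)
        distances.append((row, dist))
    distances.sort(key=operator.itemgetter(1))
    # return the k rows with the smallest Euclidean distance
    return [distances[i][0] for i in range(k)]
```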
So the function executed without any
errors. So let's test our function. So
suppose the training data set includes
the data like 222 and it belongs to
class A and other data includes 444 and
it belongs to class B and our testing
instance is 555. And now we have to
predict whether this test instance
belongs to class A or it belongs to
class B. All right. For K equal 1, we
have to predict its nearest neighbor and
predict whether this test instance it
will belong to class A or will it belong
to class B. All right. So let's execute
the run button. All right. So on
executing the run button, you can see
that we have output as 444 and B. Our
new instance 555 is closest to 444 which
belongs to class B. All right. Now once
you have located the most similar
neighbor for a test instance, next task
is to predict a response based on those
neighbors. So how we can do that? Well,
we can do this by allowing each neighbor
to vote for their class attribute and
take the majority vote as a prediction.
Let's see how we can do that. So we have
a function as get response which takes
neighbors as the input. Well, this
neighbor was nothing but the output of
this get neighbor function. The output
of get neighbor function will be fed to
get response. All right, let's execute
the run button. It's executed. Let's
move ahead and test our function get
response. So we have a neighbor as 111.
It belongs to class A. 222 it belongs to
class A. 333 it belongs to class B. So
this response variable will store the value returned by get response when we pass in these neighbor values. All right.
So what we want to check is we want to
predict whether our test instance 555 it
belongs to class A or class B when the
neighbors are 111 A 222 A and 333 B. So
let's check our response. Now that we
have created all the different functions which are required for the KNN algorithm, the important concern now is how to evaluate the accuracy of the
prediction. An easy way to evaluate the
accuracy of the model is to calculate a
ratio of the total correct prediction to
all the prediction made. So for this
I'll be defining a function as get
accuracy, and inside that I'll be passing my test data set and the predictions. The get accuracy function executed without any error. Let's check it for a sample data set. So we have our test data set as 111 which belongs to class A, 222 which again belongs to class A, and 333 which belongs to class B. And my
predictions is for first test data it
predicted that it belongs to class A
which is true. For next it predicted
that belongs to class A which is again
true. And for the next again it
predicted that it belongs to class A
which is false in this case cuz the test
data belongs to class B. All right. So
in total we have two correct predictions out of three. All right. So the ratio will be 2/3, which is nothing but 66.66%. So our accuracy rate is 66.66%. So now that we have created all the functions that are required for the KNN algorithm, let's compile them into one single main function. All right. So this is our main function and we are using the iris data set with a split of 0.67 and the value of K is three. Let's see
what is the accuracy score of this.
Check how accurate our model is. So in
training data set we have 113 values and
in the test data set we have 37 values.
These are the predicted and the actual
values of the output. Okay. So in total
we got an accuracy of 97.29%.
Which is really very good. All right.
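To tie the walkthrough together, here is a minimal end-to-end sketch of the remaining pieces, assuming an iris.data CSV file is available locally and reusing the euclidean_distance and get_neighbors sketches from above; it illustrates the approach, not the instructor's exact notebook code:

```python
import csv
import random
import operator

def load_dataset(filename, split, training_set, test_set):
    # read the CSV and randomly assign each row to the train or test list
    with open(filename) as f:
        for row in csv.reader(f):
            if not row:
                continue
            features = [float(x) for x in row[:4]] + [row[4]]
            (training_set if random.random() < split else test_set).append(features)

def get_response(neighbors):
    # majority vote over the class labels of the neighbors
    votes = {}
    for n in neighbors:
        votes[n[-1]] = votes.get(n[-1], 0) + 1
    return max(votes.items(), key=operator.itemgetter(1))[0]

def get_accuracy(test_set, predictions):
    correct = sum(1 for row, pred in zip(test_set, predictions) if row[-1] == pred)
    return correct / len(test_set) * 100.0

def main():
    training_set, test_set = [], []
    load_dataset('iris.data', 0.67, training_set, test_set)
    k = 3
    predictions = [get_response(get_neighbors(training_set, row, k)) for row in test_set]
    print('Accuracy:', get_accuracy(test_set, predictions), '%')

main()
```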
Now the success of the human race is because of the ability to communicate and share information, and that is where the concept of language comes in. Over time, many standards came up, resulting in many languages, with each language having its own set of basic shapes called alphabets. And the combination of
alphabets resulted in words and the
combination of these words arranged
meaningfully resulted in the formation
of a sentence. Now each language has a
set of rules that is used while
developing these sentences and these set
of rules are also known as grammar. Now
coming to today's world that is the 21st
century. According to the industry
estimates only 21% of the available data
is present in the structured format.
Data is being generated as we speak, as
we tweet, as we send messages on
WhatsApp, Facebook, Instagram or through
text messages. And the majority of this
data exists in the textual form which is
highly unstructured in nature. Now in
order to produce significant and
actionable insights from the text data,
it is important to get acquainted with
the techniques of text analysis. So
let's understand what is text analysis
or text mining. Now it is the process of
deriving meaningful information from
natural language text. And text mining
usually involves the process of
structuring the input text, deriving
patterns within the structured data and
finally evaluating the interpreted
output. Compared with the kind of data
stored in database, text is
unstructured, amorphous and difficult to
deal with algorithmically.
Nevertheless, in the modern culture,
text is the most common vehicle for the formal exchange of information. Now as
text mining refers to the process of
deriving highquality information from
text the overall goal here is to turn
the text into data for analysis and this
is done by the application of NLP or
natural language processing. So let's
understand what is natural language
processing. So NLP refers to the
artificial intelligence method of
communicating with an intelligence
system using natural language. By
utilizing NLP and its components, one
can organize the massive chunks of
textual data, perform numerous automated tasks and solve a wide range of
problems such as automatic
summarization, machine translation,
named entity recognition, speech
recognition, and topic segmentation. So
let's understand the basic structure of
an NLP application. Considering the
chatbot here as an example, we can see
first we have the NLP layer which is
connected to the knowledge base and the
data storage. Now the knowledge base is
where we have the source content that is
we have all the chat logs which contain
a large history of all the chats which
are used to train the particular
algorithm and again we have the data
storage where we have the interaction
history and the analytics of that
interaction which in turn helps the NLP
layer to generate the meaningful output.
So now if we have a look at the various
applications of NLP. First of all we
have sentiment analysis. Now this is a
field where NLP is used heavily. We have
speech recognition. Now here we are also
talking about the voice assistants like
Google assistant, Cortana and the Siri.
Now next we have the implementation of
chatbot as I discussed earlier just now.
Now you might have used the customer
care chat services of any app. It also
uses NLP to process the data entered and
provide the response based on the input.
Now machine translation is also another
use case of natural language processing.
Now considering the most common example
here would be the Google translate. It
uses NLP and translates the data from
one language to another and that too in
real time. Now other applications of NLP
include spell checking. Then we have the
keyword search which is also a big field
where NLP is used. Extracting
information from any particular website
or any particular document is also a use
case of NLP. And one of the coolest
application of NLP is advertisement
matching. Now here what we mean is
basically recommendation of the ads
based on your history. Now NLP is
divided into two major components that
is the natural language understanding
which is also known as NLU and we have
the natural language generation which is
also known as NLG. The understanding
involves tasks like mapping the given input in natural language into useful representations and analyzing different
aspects of the language. Whereas natural
language generation it is the process of
producing the meaningful phrases and
sentence in the form of natural
language. It involves text planning,
sentence planning and text realization.
Now NLU is usually considered harder
than NLG. Now you might be thinking that
even a small child can understand a
language. So let's see what are the
difficulties a machine faces while
understanding any particular languages.
Now understanding a new language is very
hard. Taking our English into
consideration, there is a lot of ambiguity, and that too at different levels. We have lexical ambiguity,
syntactical ambiguity and referential
ambiguity. So lexical ambiguity is the
presence of two or more possible
meanings within a single word. It is
also sometimes referred to as semantic
ambiguity. For example, let's consider
these sentences and let's focus on the
italicized words. She is looking for a
match. So what do you infer by the word
match? Is it that she is looking for a
partner or is it that she's looking for
a match be it a cricket match or a rugby
match? Now the second sentence here we
have the fisherman went to the bank. Is
it the bank where we go to collect our
checks and money or is it the river bank
we are talking about here. Sometimes it
is obvious that we are talking about the
river bank but it might be true that
he's actually going to a bank to
withdraw some money. You never know. Now
coming to the second type of ambiguity
which is the syntactical ambiguity in
English grammar. This syntactical
ambiguity is the presence of two or more
possible meanings within a single
sentence or a sequence of words. It is
also called as structural ambiguity or
grammatical ambiguity. Taking these
sentences into consideration, we can
clearly see what are the ambiguities
faced. The chicken is ready to eat. So
here what do you infer? Is the chicken
ready to eat his food or is the chicken
ready for us to eat? Similarly, we have
the sentence like visiting relatives can
be boring. Are the relatives boring or
when we are visiting the relative it is
very boring? You never know. Coming to
the final ambiguity which is the
referential ambiguity. Now this
ambiguity arises when we are referring
to something using pronouns. Take the sentence: the boy told his father about the theft; he was very upset. Now I'm leaving this up to you.
You tell me what does he stand for here?
Who is he? Is it the boy? Is it the
father or is it the thief?
So coming back to NLP. Firstly, we need
to install the NLTK library that is the
natural language toolkit. It is the
leading platform for building Python
programs to work with human language
data, and it also provides easy-to-use interfaces to over 50 corpora and
lexical resources. We can use it to
perform functions like classification,
tokenization, stemming, tagging and much
more. Now once you install the NLTK
library, you will see an NLTK
downloader. It is a pop-up window which
will come up and in that you have to
select the all option and press the
download button. It will download all the required files: the corpora, the models and all the different packages which are available in NLTK. Now
when we process text there are a few
terminologies that we need to
understand. Now the first one is
tokenization. So tokenization is a process of breaking strings into tokens, which in turn are small structures or units that can be used for further processing. Now tokenization involves three steps: breaking a complex sentence into words, understanding the importance of each word with respect to the sentence, and finally producing a structural description of the input sentence. So if we have a look at the
example here considering this sentence
tokenization is the first step in NLP.
Now when we divide it into tokens, as
you can see here, we have 1 2 3 4 5 6
and seven tokens here. Now NLTK also
allows you to tokenize phrases
containing more than one word. So let's
go ahead and see how we can implement
tokenization using NLTK. So here I'm
using Jupyter notebook to execute all my
practicals and demo. Now you are free to
use any sort of IDE which is supported
by Python. It's your choice. So let me
create a new notebook here. Let me
rename as text mining and NLP.
So first of all let us import all the
necessary libraries. Here we are importing os, NLTK and the NLTK corpus.
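In code, these imports and the downloader step described earlier would look roughly like this (which resources you fetch is up to you):

```python
import os               # imported in the video for general file handling
import nltk
from nltk import corpus  # gives access to the bundled corpora

# nltk.download() opens the downloader pop-up described earlier; selecting the
# "all" option fetches every corpus, model and package. Individual resources
# can also be fetched directly, e.g. nltk.download('punkt').
nltk.download()
```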
So as you can see here we have various
files which represent different types of
words, different types of functions. We
have samples of Twitter.
We have different sentimental word net.
We have product reviews. We have movie
reviews. We have non-breaking prefixes
and many more files here.
Now let's have a look at the Gutenberg
file here and see what are all the
fields which are present in the
Gutenberg file. So as you can see here
inside this we have all the different
types of text files. We have Austen's Emma, we have Shakespeare's Hamlet, we have Moby Dick, we have Carroll's Alice in Wonderland and many more. Now this is
just one file we are talking about and
NLTK provides a lot of files. So let's
consider a document of type string and
understand the significance of its
tokens. So if you have a look at the
elements of the Hamlet, you can see it
starts with The Tragedy of Hamlet by William Shakespeare.
So if you have a look at the first 500
elements of this particular text file.
So as I was saying, it reads The Tragedy of Hamlet by William Shakespeare 1599, Actus Primus, and so on. We can use a lot of these files
for analysis and text for understanding
and analysis purposes and this is where
NLTK comes into picture and it helps a
lot of programmers to learn about the
different features and the different
application of language processing. So
here I have created a paragraph on artificial intelligence. So let me just
execute it. Now this AI is of the string
type. So it will be easier for us to
tokenize it. Nonetheless, any of the
files can be used to tokenize. For
simplicity here, I'm taking a string
file. The next what we are going to do
is import the word tokenize under the
NLTK tokenize library. Now this will
help us to tokenize all the words. Now
we will run the word tokenize function
over the paragraph and assign it a name.
So here I'm considering AI tokens and
I'm using the word tokenize function on
it. Let's see what's the output of this
AI tokens.
So as you can see here it has divided
all the input which was provided here
into the tokens.
Now let's have a look at the number of tokens we have here. So in total we have 273 tokens. Now these tokens are a list of words and special characters, each a separate item of the list.
Now in order to find the frequency of
the distinct elements here in the given
AI paragraph, we are going to import the FreqDist (frequency distribution) function, which falls under nltk.probability. So let's create an fdist object using the FreqDist function. And basically what we are doing here is finding the word count of all the words in the paragraph.
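A minimal sketch of the tokenization and frequency steps described so far might look like this; the AI string below is just a stand-in, since the exact paragraph used in the video isn't shown:

```python
from nltk.tokenize import word_tokenize
from nltk.probability import FreqDist

AI = ("Artificial intelligence is the simulation of human intelligence by machines. "
      "Artificial intelligence systems can learn, reason and act on their own.")

AI_tokens = word_tokenize(AI)          # requires the 'punkt' tokenizer models
print(len(AI_tokens))                  # total number of tokens

fdist = FreqDist()
for word in AI_tokens:
    fdist[word.lower()] += 1           # lowercase so 'The' and 'the' are counted together
print(fdist.most_common(10))           # the 10 most frequent tokens
```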
So as you can see here, we have the comma 30 times, the full stop nine times, accomplished once, according once, and so on; we have computer five times.
Now here we are also converting the
tokens into lower case so as to avoid
the probability of considering a word
with upper case and lower case as
different. Now suppose we were to select
the top 10 tokens with the highest
frequency. So here you can see that we have the comma 30 times, 'the' 13 times, 'of' 12 times and 'and' 12 times, whereas the meaningful words, intelligent and intelligence, appear around six times each. Now there is another type of tokenizer, which is the blank line tokenizer. Let's use the blank line tokenizer over the same string to tokenize the paragraph with respect to blank lines. Now the output here is nine, and this nine indicates how many paragraphs we have, where the paragraphs are separated by a blank line. Although it might seem like one paragraph, it is not. The original structure of the data remains intact. Now other important key terms in tokenization are bigrams, trigrams and ngrams. What do these mean? Bigrams refers to tokens of two consecutive written words, known as a bigram. Similarly, tokens of three consecutive written words are known as a trigram. And similarly, we have ngrams for n consecutive written words. So, let's go ahead and execute a demo based on bigrams, trigrams, and ngrams. First of all, what we need to do is import bigrams, trigrams, and ngrams from nltk.util.
Now, let's take a string here on which
we'll use these functions. So taking
this string into consideration, the best
and the most beautiful thing in the
world cannot be seen or even touched.
They must be felt with the heart. So
first what we are going to do is split
the above sentence or the string into
tokens. So for that we are going to use
the word tokenize. So as you can see
here we have the tokens. Now let us create the bigrams of the list containing the tokens. For that we are going to use nltk.bigrams and pass all the tokens, and since the result is a generator we are going to wrap it in the list function. So as you can see in the output we have pairs like 'the best', 'best and', and so on; the tokens are in the form of two words, in pair form. Similarly, if we want to find out the trigrams, what we need to do is just replace bigrams with trigrams. So as you can see we have tokens in the form of three words, and if you want to use ngrams, let me show you how it's done. For ngrams what we need to do is define a particular number; instead of n I'm going to use, let's say, four. So as you can see we have the output in the form of four tokens.
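A small sketch of the bigram, trigram and ngram steps just described, using an approximation of the quoted string:

```python
from nltk import word_tokenize
from nltk.util import bigrams, trigrams, ngrams

string = ("The best and most beautiful things in the world cannot be seen or even "
          "touched, they must be felt with the heart.")
tokens = word_tokenize(string)

print(list(bigrams(tokens)))      # pairs of consecutive tokens
print(list(trigrams(tokens)))     # triples of consecutive tokens
print(list(ngrams(tokens, 4)))    # n consecutive tokens, here n = 4
```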
Now once we have the tokens we need to
make some changes to the tokens. So for
that we have stemming. Now stemming usually refers to normalizing words into their base form or root form. So if we have a look at the words here, we have affectation, affects, affections, affected, affection and affecting. So as
you might have guessed the root word
here is affect. So one thing to keep in
mind here is that the result may not be
the root word always.
Stemming algorithms work by cutting off the end or the beginning of the word, taking into account a list of common prefixes and suffixes that can be found in an inflected word. Now this
indiscriminate cutting can be successful
in some occasions but not always. And
this is why we affirm that this approach
presents some limitations. So let's go
ahead and see how we can perform
stemming on a particular given data set.
Now there are quite a few types of stemmers. Starting with the Porter stemmer, we need to import it from nltk.stem. Let's get the output of the word having and see what is the stemming of this word. So as you can see, we have have as the output.
Now here we have defined words to stem
which are give, giving, given and gave.
So let's use the Porter stemmer and see what is the output of this particular stemming. So as you can see, it has given give, give, given and gave. We can see that the stemmer removed only the ing from giving and replaced it with an e. Now let's try to do the same with another stemmer called the Lancaster stemmer. You can see this stemmer stemmed all the words, and as a result you can conclude that the Lancaster stemmer is more aggressive than the Porter stemmer.
Now the use of each of these stemmers depends on the type of task that you want to perform. For example, if you want to check how many times the stem giv is used above, you can use the Lancaster stemmer, and for other purposes you have the Porter stemmer as well.
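A quick sketch of the stemmers being compared here; the Snowball stemmer mentioned next is included as well, since it only differs in needing a language:

```python
from nltk.stem import PorterStemmer, LancasterStemmer, SnowballStemmer

words_to_stem = ["give", "giving", "given", "gave"]

porter = PorterStemmer()
lancaster = LancasterStemmer()
snowball = SnowballStemmer("english")    # the Snowball stemmer needs the language specified

print([porter.stem(w) for w in words_to_stem])      # gentler stemming
print([lancaster.stem(w) for w in words_to_stem])   # more aggressive stemming
print([snowball.stem(w) for w in words_to_stem])
```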
Now, there are a lot of stemmers. There
is one snowball stemmer also present
where you need to specify the language
which you are using and then use the
Snowball stemmer. Now, as we discussed, stemming algorithms work by cutting off the end or the beginning of the
word. On the other hand, lemmatization takes into consideration the morphological analysis of the word. Now, in order to do so, it is necessary to have a detailed dictionary which the algorithm can look into to link the form back to its lemma. What lemmatization does is group together the different inflected forms of a word under a single base form, called the lemma. It is somewhat similar to stemming, as it maps several words onto a common root. Now, one of the most important things to consider here is that the output of lemmatization is a proper word, unlike stemming, where we got the output GIV. Now GIV is not a word, it's just a stem. For example, if lemmatization is run on go, going and went, they all map to go, because that is the root of all three words. So let's go ahead and see how lemmatization works on the given input data. For that we are going to import the lemmatizer from NLTK. We are also importing WordNet here. As I mentioned earlier, lemmatization requires a detailed dictionary because its output is a root word, a proper word, not just any random string. So to find that proper word it needs a dictionary. Here we are providing the WordNet dictionary and we are using the WordNet lemmatizer, passing the word corpora into the WordNet lemmatizer. So can you guys tell
me what is the output of this one? I'll
leave this up to you guys. I won't
execute the sentence. Let me remove this
sentence here. You guys tell me in the
comments below what will be the output
of the lemmatization of the word corpora.
And what will be the output of the
stemming? You guys execute that and let
me know in the comment section below.
Now let's take these words into
consideration. Give, giving, given and
gave and see what is the output of the
lemmatization.
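As a reference, a minimal sketch of the lemmatization step (it assumes the wordnet corpus has been downloaded):

```python
from nltk.stem import WordNetLemmatizer   # backed by the WordNet dictionary

word_lem = WordNetLemmatizer()
for w in ["give", "giving", "given", "gave"]:
    # without a POS tag, every word is treated as a noun
    print(w, "->", word_lem.lemmatize(w))
```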
So as you can see here, the lemmatizer has kept the words as they are, and this is because we haven't assigned any POS tags here, and hence it has assumed all the words are nouns. Now you might be wondering what POS tags are. Well, I'll tell you about POS tags later in this video. For now, let's keep it simple: POS tags usually tell us what exactly the given word is. Is it a noun? Is it a verb? Or
is it different parts of speech?
Basically POS stands for parts of
speech. Now, did you know that there are several words in the English language, such as I, at, for, above and below, which are very useful in the formation of sentences, and without them a sentence would not make any sense? But these words do not provide much help in natural language processing, and this list of words is also known as stop words. NLTK
has its own list of stop words and you
can use the same by importing it from
nltk.corpus.
So the question arises are they helpful
or not? Yes, they are helpful in the
creation of sentences but they are not
helpful in the processing of the
language. So let's check the list of
stop words in NLTK.
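A small sketch of checking and removing stop words, assuming the stopwords corpus has been downloaded and reusing the AI_tokens list from the earlier sketch:

```python
from nltk.corpus import stopwords

stop = stopwords.words('english')
print(len(stop))   # the video reports 179; the exact count depends on the NLTK version

# keep only the tokens that are not stop words
filtered_tokens = [t for t in AI_tokens if t.lower() not in stop]
```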
So from nltk.corpus we are importing the stop words, and then we can ask for all the stop words that are there in the English language. Let's see. So as you can see
here we have the list of all the stop
words which are defined in the English
language and we have 179 total number of
stop words. Now as you can see here, we have words such as few, more, most, other and some. Now these words are very
necessary in the formation of sentences.
You cannot ignore these words but for
processing these are not important at
all. So if you remember, we had the top 10 tokens from that particular text, that is, the AI paragraph I mentioned earlier, which was given by fdist. Let's take that into consideration, and what you can see here is that except for intelligent and intelligence, most of the tokens are either punctuation or stop words and hence can be removed. Now
we'll use the compile from the re module
to create a string that matches any
digit or special character and then
we'll see how we can remove the stop
words. So if you have a look at the
output of the post punctuation,
you can see there are no stop words here
in the particular given output. And if
you have a look at the output of the
length of the post punctuation, it's 233
compared to the 273 the length of the AI
tokens. Now this is very necessary in language processing, as it removes all the unnecessary tokens which do not hold much meaning. Now coming
to another important topic of natural
language processing and text mining or
text analysis is the parts of speech.
Now generally speaking, the grammatical type of a word, which can be a verb, noun, adjective, adverb or article, indicates how the word functions in meaning as well as grammatically within the sentence. Now a word can have
more than one part of speech based on
the context in which it is used. For
example, if you take the sentence into
consideration, Google something on the
internet. Now here Google acts as a verb
although it is a proper noun. So as you can see here, we have so many types of POS tags and we have the descriptions of those various tags. We have the coordinating conjunction CC and the cardinal number CD. We have JJ as adjective and MD as modal. We have the proper noun, singular and plural. We have verbs, different types of verbs. We have the interjection and the symbol. We have the wh-pronoun and the wh-adverb. Now POS tagging can be used as a statistical NLP task. It distinguishes
the sense of the word which is very
helpful in text realization and it is
easy to evaluate as in how many tags are
correct and you can also infer semantic
information from the given text. So
let's have a look at some of the
examples of pos. So take the sentence
the dog killed the bat. So here, 'the' is a determiner, dog is a noun, killed is a verb, and again 'the' and bat are a determiner and a noun respectively. Now let's
consider another sentence. The waiter
cleared the plates from the table. So as
you can see here all the tokens here
correspond to a particular type of tag
which is the parts of speech tag. It is
very helpful in text realization. Now
let's consider a string and check how
NLTK performs POS tagging on it. So let's take the sentence Timothy is a natural when it comes to drawing. First we are going to tokenize it, and under NLTK we have the pos tag function, to which we'll pass all the tokens. So as you can see, we have Timothy as a proper noun, is as a verb, a as a determiner, natural as an adjective, when as a wh-adverb, it as a pronoun, comes as a verb, to as 'to', and drawing as a verb again. So this is how you get the POS tags.
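A minimal sketch of this tagging step (the averaged perceptron tagger resource must be downloaded first):

```python
import nltk
from nltk import word_tokenize

sent = "Timothy is a natural when it comes to drawing"
tokens = word_tokenize(sent)
print(nltk.pos_tag(tokens))
# e.g. [('Timothy', 'NNP'), ('is', 'VBZ'), ('a', 'DT'), ('natural', 'JJ'), ...]
```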
The pos tag function does all the work
here. Now let's take another example
here. John is eating a delicious cake.
And let's see what's the output of this
one. Now here you can see that the
tagger has tagged both the word is and
eating as a verb because it has
considered is eating as a single term.
This is one of the few shortcomings of
the POS taggers, and one important thing to keep in mind. Now after POS tagging,
there is another important topic which
is the named entity recognition. So what
does it mean? Now the process of
detecting the named entities such as the
person name, the location name, the
company name, the organization, the
quantities and the monetary value is
called named entity recognition. Now in named entity recognition we have three steps. First we have noun phrase identification; this step deals with extracting all the noun phrases from a text using dependency parsing and parts of speech tagging. Then we have phrase classification; this is the classification step in which all the extracted noun phrases are classified into their respective categories, which are locations, names, organizations and much
more. And apart from this, one can
curate the lookup tables and
dictionaries by combining information
from different sources. And finally we
have the entity disambiguation. Now
sometimes it is possible that the
entities are misclassified. Hence creating
a validation layer on top of the result
is very useful and the use of knowledge
graphs can be exploited for this
purpose. Now the popular knowledge
graphs are Google knowledge graph the
IBM Watson and Wikipedia.
So let's take a sentence into
consideration: the Google CEO Sundar Pichai introduced the new Pixel at the Minnesota Roy Center event. So as you can see here, Google is an organization, Sundar Pichai is a person, Minnesota is a location, and the Roy Center event is also tagged as an organization. Now for
using NER in Python, we'll have to import the ne chunk function from the NLTK module. So let's consider some text data here and see how we can perform NER using the NLTK library. First we need to import the ne chunk function. Let's consider the sentence: the US president stays in the White House. So we need to do all these processes again: we tokenize the sentence first and then add the POS tags, and then we use the ne chunk function and pass the list of tuples containing the POS tags to it. Let's see the output. So as you can see, the US here is recognized as an organization, and White House is clubbed together as a single entity and is recognized as a facility. Now this is only possible because of the POS tagging. Without the POS tagging it would be very hard to detect the named entities of the given tokens.
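A rough sketch of the NER steps just described (ne_chunk needs the maxent_ne_chunker and words resources to be downloaded):

```python
from nltk import word_tokenize, pos_tag, ne_chunk

sentence = "The US president stays in the White House"
tags = pos_tag(word_tokenize(sentence))   # tokenize first, then add POS tags
tree = ne_chunk(tags)                     # named entities appear as labelled subtrees
print(tree)
```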
Now that we have understood what named entity recognition is, let's go ahead and understand one of the most important topics in NLP and text mining, which is syntax. So what is syntax?
So in linguistics syntax is the set of
rules, principle and the processes that
govern the structure of a given sentence
in a given language. The term syntax is
also used to refer to the study of such
principles and processes. So what we
have here are certain rules as to what
part of the sentence should come at what
position. With these rules, one can
create a syntax tree whenever there is a
sentence input. Now syntax tree in
layman terms is basically a tree
representation of the syntactic
structure of the sentence of the
strings. It is a way of representing the
syntax of a programming language as a
hierarchical tree structure. This
structure is used for generating symbol
tables for compilers and later code
generation. The tree represents all the
constructs in the language and their
subsequent rules. So let's consider the
statement the cat sat on the mat. So as
you can see here, the input is a sentence and it has been classified into a noun phrase, a verb and a prepositional phrase; the noun phrase is further classified into an article and a noun, then we have the verb which is sat, and finally we have the preposition on, and the article and the noun which are the and mat. Now in order to render syntax
trees in our notebook, you need to install Ghostscript, which is a rendering engine. Now this takes a lot of time, so let me show you from where you can download Ghostscript. Just type in download Ghostscript and select the latest version here.
So as you can see we have two types of
license here. We have the general public
license and the commercial license, since creating syntax trees and following them is an important part of commercial tools as well, and a commercial license is also available and very useful. So I'm not going to go much deeper into what a syntax tree is and how we can draw one. So now that we have
understood what are syntax trees, let's
discuss the important concept with
respect to analyzing the sentence
structure which is chunking. So chunking
basically means picking up individual
pieces of information and grouping them
into bigger pieces. And these bigger
pieces are also known as chunks. In the
context of NLP and text mining, chunking
means grouping of words or tokens into
chunks. So let's have a look at the
example here. So the sentence under consideration is: we caught the black panther. We is a pronoun, caught is a verb, the is a determiner, black is an adjective and panther is a noun. So what it has done here, as you can see, is that black, which is an adjective, panther, which is a noun, and the, which is a determiner, are chunked together into a noun phrase.
So let's go ahead and see how we can
implement chunking using the NLTK.
So let's take the sentence: the big cat ate the little mouse who was after the fresh cheese. We'll use the POS tags here and also use the tokenizing function. So as you can see here, we have the tokens and we have the POS tags. What we'll do now is create a grammar for a noun phrase, and we'll mention the tags that we want in our chunk phrase within the curly braces; that will be our grammar NP. Now here we have created a regular expression matching string. We now have to parse the chunks, and hence we'll create a chunk parser and pass our noun phrase grammar string to it. So as you can
see we have a certain error and let me
tell you why this error occurred. So this error occurred because we did not install Ghostscript, so the syntax tree cannot be drawn. But in the final output we still have a tree structure here, which is not exactly a visualization, but it's there. So as you can see here, we have the NP noun phrase for the little mouse, and again we have a noun phrase for fresh cheese also.
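A minimal sketch of the chunking steps walked through above, with a simple noun phrase grammar as an assumption, since the exact grammar used in the video isn't shown:

```python
import nltk
from nltk import word_tokenize, pos_tag

sentence = "The big cat ate the little mouse who was after the fresh cheese"
tags = pos_tag(word_tokenize(sentence))

# NP = optional determiner, any number of adjectives, then a noun
grammar_np = r"NP: {<DT>?<JJ>*<NN>}"
chunk_parser = nltk.RegexpParser(grammar_np)
chunk_result = chunk_parser.parse(tags)
print(chunk_result)   # the tree prints as text even without Ghostscript installed
```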
Although fresh is an adjective and
cheese is a noun, it has grouped these two words into a noun phrase. So this is how you execute chunking with the NLTK library. So by now we have learned
almost all the important steps in text
processing and let's apply them all in
building a machine learning classifier
on the movie reviews from the NLTK
corpora. So for that first let me import
all the libraries
which are the pandas the numpy library.
Now these are the basic libraries needed
in any machine learning algorithm.
We are also importing the count
vectorizer. I'll tell you why it is used
later. Now let's just import it for now.
So again if we have a look at the
different elements of the corpora as we
saw earlier in the beginning of our
session we have so many files in the
given NLTK corpora. Now let's access the movie reviews corpus under the NLTK corpora. As you can see here, we have the movie reviews. So for that we are going to import the movie reviews from the NLTK corpus. So if you have a
look at the different categories of the
movie reviews we have two categories
which are the negative and the positive.
So if you have a look at the positive we
can see we have so many text files here.
Similarly if we have a look at the
negative, we have 1,000 negative files here as well, which contain the negative feedback.
So let's take a particular positive one
into consideration which is the
cv0029590.
You can take any one of the files here
doesn't matter. Now, as you can see here, the file is already tokenized. Tokenization is generally useful for us, but here it has actually increased our work, because in order to use the CountVectorizer and TF-IDF we must pass strings instead of tokens. Now, in order to convert the tokens back into strings, we could use the detokenizer within NLTK, but that has some licensing issues as of now with the conda environment. So instead of that, we can use the join method to join all the tokens of the list into a single string, and that's what we are going to use here. So first
we are going to create an empty list and
append all the tokens within it. We have
the review list that is an empty list.
Now what we are going to do here is remove all the extra spaces and commas from the list while appending it to the empty list, and perform the same for the positive and the negative reviews. So
this one we are doing it for the
negative reviews and then we'll do the
same for the positive reviews as well.
So if you have a look at the length of
this negative review list, it's 1,000.
And the moment we add the positive
reviews also, I think the length should
reach 2,000.
So let me just define the positive
reviews.
Now execute the same for positive
reviews. And then again, if we have a
look at the length of the review list,
it should be 2,000. That is good. Now
let us create the targets before creating the features for our classifiers. So while creating the targets, we are denoting the negative reviews as zero and the positive reviews as one, and we will create an empty list and add 1,000 zeros followed by 1,000 ones into it. Now we'll create a pandas Series for the target list. The type of Y must come out as a pandas Series, and if you have a look at the output of type(Y), it is pandas.core.series.Series. That is good. Now let's have a look at the first five entries of the series.
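A hedged sketch of how the review list and the targets described above could be built (it assumes the movie_reviews corpus is downloaded):

```python
import pandas as pd
from nltk.corpus import movie_reviews

# join the pre-tokenized words of each review back into one string
rev_list = []
for category in ['neg', 'pos']:                 # 1,000 negative, then 1,000 positive
    for fid in movie_reviews.fileids(category):
        rev_list.append(' '.join(movie_reviews.words(fid)))

target = [0] * 1000 + [1] * 1000                # 0 = negative, 1 = positive
Y = pd.Series(target)
print(len(rev_list), type(Y), Y.head())
```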
So as you can see, it is 1,000 zeros followed by 1,000 ones, so the first five entries are all zeros. Now
we can start creating features using the
CountVectorizer, or bag of words. For that we need to import the CountVectorizer. Once we have initialized the vectorizer, we need to fit it onto the review list. Now let us have a look at the dimensions of this particular matrix. So as you can see, it's 2,000 by 16,228. Now we are going to create a list with the names of all the features by asking the vectorizer for its feature names. So as you can see here, we have our list. Now what we'll do is create a pandas data frame by passing the sparse matrix as the values and the feature names as the column names. Now let us check the dimensions of this particular pandas data frame. So as you can see, it's the same dimension, 2,000 by 16,228. Now if we have a look at the top five rows of the data frame, you can see here we have 16,228 columns with five rows, and the entries shown are all zero. Next, we are going to split the data frame into training and testing sets, and let us examine the training and the test sets as well. So as you can see, the test size here we have defined as 0.25, that is, the test set gets 25% and the training set will have 75% of the particular data frame. So if you have a look at the shape of X train, we have 1,500 rows, and if you have a look at the dimensions of X test, it is 500 rows.
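A sketch of the vectorizing and splitting steps, continuing from the sketch above and assuming a recent scikit-learn version:

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(rev_list)     # sparse matrix of shape (2000, vocabulary size)

# dense data frame with the words as column names
df = pd.DataFrame(X.toarray(), columns=vectorizer.get_feature_names_out())

X_train, X_test, y_train, y_test = train_test_split(df, Y, test_size=0.25)
print(X_train.shape, X_test.shape)         # roughly 1,500 and 500 rows
```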
So now our data is split. Now we'll use
the Naive Bayes classifier for text classification over the training and testing sets. So now most of you guys might already be aware of what a Naive Bayes classifier is. It is basically a classification technique based on Bayes' theorem with an assumption of independence among predictors. In simple terms, a Naive Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature. To know more, you can watch our Naive Bayes classifier video, the link to which is given in the description box below. If you want to pause at this moment and quickly check what a Naive Bayes classifier does and how it works, you can watch that video and come back here. Now, to implement the Naive Bayes algorithm in Python, we'll use the following libraries and functions. We are going to import GaussianNB from the sklearn library, which is scikit-learn, instantiate the classifier, and fit the classifier with the training features and the labels. We are also going to import the Multinomial Naive Bayes, because our features here are word counts, that is, multinomial features, rather than just two-valued features. So now we have passed the training data to this particular Multinomial Naive Bayes, and then we will use the predict function and pass the training features.
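A minimal sketch of the classification step, continuing from the sketches above; note that it evaluates on the held-out test set rather than on the training features, which is why it will usually not report a perfect score:

```python
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, confusion_matrix

clf = MultinomialNB()
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
print(accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))
```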
Now let's have a look and check the accuracy of this particular model. So as you can see here, the accuracy is one, which is very highly unlikely; since it has given one, that means the model is overfitting and is overly accurate on the data it was trained on. And you can also check the confusion matrix for the same. For that, what you need to do is use the confusion matrix on these variables, which are y test and y predicted. So as you can see here, although it has predicted 100% accuracy, an accuracy of one is very highly unlikely, and you might have got a different output for this one. I've got the output here as 1.0; you might have got an output of 0.6, 0.7 or any number between 0 and 1.
Now let's answer the fundamental
question that is what exactly is
generative AI? Generative AI refers to
algorithm capable of creating new
content whether text, images, audio or
even videos. It's like having a creative
AI assistant that can take a simple
input and produce engaging outputs. For
example, GPT and LLaMA can write essays or code, while image generation models like DALL-E and Stable Diffusion can visualize unique scenes from
descriptions. But let's look at some of
the popular tools driving this
innovation. Well, some of the standard tools in generative AI include GitHub Copilot, which assists developers with code suggestions and chat interactions. Image generation tools like Stable Diffusion and Midjourney help creators bring visual concepts to life. Google's Gemini merges text and image capabilities, while Adobe Firefly extends AI's reach to creative work. So if you
want to know how to use these tools then
check out our generative AI examples
video link in the description. So you
might wonder where are these tools being
applied. Now let's explore them.
Generative AI is transforming multiple
creative fields. Image generation tools
power visual design, music composition algorithms create original scores, and AI assists video editors with automated tasks.
LLMs help generate and translate text
while code generation tools like GitHub
copilot boost developer efficiency. AI
generated voices are even being used in
audio books and voice assistants. So now
let's take some of these tools and
check. So this time we will use Pictory AI and Fliki AI. First let's explore Pictory AI. So for that, let's go to its site and check its functions. So we are at the Pictory AI site. On the left side we have the home, projects and brand kits, and on the main screen we have the different features Pictory AI provides. So let's choose text to video. Here let's write
some names and description and press
generate.
Pictory AI is a tool designed for video
creators that helps transform long-form content such as articles or blog posts into short, engaging videos. It uses AI to automatically extract key highlights and create professional-looking videos with minimal effort. Due to its simplicity and time-saving capabilities, Pictory AI is a popular choice for social media content creation and marketing. Now let's see
our next tool, which is Fliki AI. So now we are at the fliki.ai site. Here we
have different features like videos
where you can create videos from all of
these blogs, prompts etc. You can also
create audios from these features and
then we also have a design feature and
on the left hand side you can see
options like files, templates, brand
kits, voice clones etc. So now let's
take an idea and convert it into a
video. Now let's write our topic and
generate. Fliki AI is a content
creation tool that turns text into
videos using AI generated voices and
visuals. It helps users create
professional videos quickly by pairing
written content with stock images, animations, and voiceovers. Fliki is
ideal for marketers, content creators,
and educators looking to create engaging
video content efficiently. Now that we
have seen the applications, so let's
step back and look at the journey that
brought us here. So basically our
journey starts in 1947 with Alan
Turing's concept of intelligent
machines. By 1961, Joseph Venbomb
introduced Ela, the first chatbot. The
1980s saw the birth of recurrent neural
networks, while 1997 brought long
short-term memory networks to tackle
sequential data. And then GANs emerged
in 2014 transforming creative task. Fast
forward to 2017 when transformers like
GPT entered the scene. By 2023, GPT-3.5 and Google's PaLM marked significant
milestones. And by 2025, we are on the
brink of AI breakthroughs in chemistry
and genome editing. So what exactly are
these LLMs and why are they so powerful?
An LLM, or large language model, analyzes and understands natural language using machine learning. Examples include OpenAI's GPT, Google's PaLM and Meta's LLaMA. These
models drive applications such as
chatbots, language translation and more
by learning from extensive data to
predict and generate text sequences. But
before this, there was a very famous
term called language model. A language
model is a machine learning model that
uses probability statistics and
mathematics to predict the next sequence
of words. Suppose you have a sentence like: I have a boy who is my ____. Here,
if we ask a language model to predict
the next word, it considers the context
provided by the words before the blank.
Based on common usage patterns from its
training data, it may predict words like
boyfriend, brother, or friend which fit
naturally. However, it's less likely to
predict colleague or sibling as those
words may not commonly follow this type
of phrases. So, this process shows how
language models predict text by
calculating probabilities for each
possible word based on their likelihood
in context.
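To make the idea concrete, here is a toy sketch of next-word prediction with a simple bigram model; real language models use far more data and far richer statistics, so this is only an illustration of the probability idea:

```python
from collections import Counter, defaultdict

# a tiny toy corpus; an LLM would be trained on vastly more text
corpus = "he is my brother . he is my friend . she is my friend .".split()

# count which word follows each word (a bigram model)
follows = defaultdict(Counter)
for w1, w2 in zip(corpus, corpus[1:]):
    follows[w1][w2] += 1

# probability of each candidate word after "my"
total = sum(follows["my"].values())
for word, count in follows["my"].most_common():
    print(word, count / total)     # friend 0.67, brother 0.33
```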
So, when a language model is trained on
massive amounts of diverse text, it
gains a wider vocabulary and more
understanding of language, enabling it
to make more accurate predictions. For
example, if we give it a phrase like: you are a ____ to me, a model trained on
extensive data might suggest various
fitting words, for example friend or inspiration, based on the sentiment or context it has learned from the data. Now, here
reinforcement learning is used to
improve the model's responses over time.
By giving feedbacks be it positive or
negative on the responses we help the
model learn which type of responses are
preferred in specific context. For
example, if the model frequently
misinterprets the tone or intent, the
reinforcement learning helps adjust its predictions to be more contextually
appropriate and aligned with the
intended meaning. But what do these
models look like under the hood?
Well, LLMs are built on neural networks
composed of input, hidden, and output
layers. The hidden layers process information to learn complex patterns, and more layers mean the model can capture deeper insights. This structure allows LLMs to perform tasks from generating text to complex code
completions. Now, how do these layers
interact and function in real time? Now,
LLM is based on the transformer and a
transformer uses deep learning to
process any information coming to it.
Now, let me tell you a story of three
friends. Imagine we have three
characters. First is our friend. The
next character is Minion Bob and the
third character is Gru. So, our friend
asks Bob, "What's the price of the jet?
It must be $50,000." Minion Bob isn't
sure. So he goes to Gru and asks, "Is
the jet $50,000?" Gru replies, "No,
it's $70,000."
In this back and forth, Minion Bob is
like the neural network layer trying to
make an accurate guess. So each time he
goes back to Gru, like receiving more data or feedback, he gets corrected if his guess is wrong, leading him to refine his response. Now after the first check, Minion Bob returns to our friend,
saying, "I guess it's more than
$60,000."
Our friend assumes it might be around
$65,000. And sends Bob back to Gru to
verify. Again, Gru corrects him, "No,
it's actually $70,000."
So this process repeats with Bob
adjusting his guess each time.
Eventually, he learns that the correct
answer is $70,000 and updates his
knowledge. So just like Minion Bob
neural networks make initial guesses
based on available information with each
feedback loop, like Bob going back to Gru, the model's hidden layers adjust their parameters to refine its guesses,
ultimately arriving at the most accurate
prediction possible. So after getting
corrected multiple times minion Bob's
guesses improve until he knows the price
is $70,000.
Similarly, a neural network gradually learns the correct answer through training. So once the network has learned, it
can give accurate answers in future
cases without checking every time. Now
let us move on to understand how LLMs
work. LLMs begin with the collection of data sets, then tokenize the text and break it into manageable pieces. Using a transformer architecture, they process the data sequence all at once, leveraging vast training data. LLMs contain millions of learned parameters that predict text tokens and generate coherent outputs. Models often undergo pre-training for general knowledge and fine-tuning for specific tasks. So now
let's see some practical uses of LLMs.
LLMs power content generation, creating anything from articles to code. They excel at language translation, enhance search engines, and support personalized recommendations, code development assistance, and sentiment analysis, all of which owe much to LLMs' predictive capabilities. So guys, are you ready to
use all that knowledge in coding and
witness how these LLMs come together to
drive innovation? Whether through
developing applications, analyzing data
or building smart assistants, the gears of technology keep turning to unlock
AI's full potential. So now let us look
at our problem statement. So one of the
difficulties in the healthcare industry
is effectively evaluating medical
pictures such as MRIs, CT scans and
X-rays in order to identify anomalies
and illnesses. This procedure takes a
lot of time and calls for specialized understanding. Automated methods must be
developed to help medical personnel
recognize possible health problems in
medical imaging. In order to provide
better patient care, a system that
integrates cutting-edge machine learning
models with image analysis can greatly
help in the early detection of diseases
including cancer, infections, and other
illnesses. So, the method uses
generative AI to evaluate medical photos
and generate a thorough diagnosis report
based on the findings. This technology
allows users to upload medical images
which the AI model then processes.
Now let us build our project on a
medical image analysis application using Streamlit, Python and an LLM, Google's Gemini AI. So this app helps healthcare professionals analyze medical images such as X-rays, MRIs and CT scans to detect anomalies and diseases. First let's import the necessary libraries. So first, import streamlit as st. If this is not working or is showing an error, then open the terminal and write pip install streamlit. Then from pathlib import Path. Next, import google.generativeai as genai.
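Put together, the imports dictated above look like this:

```python
import streamlit as st
from pathlib import Path
import google.generativeai as genai
```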
So we are importing streamlit for the app interface, Path from pathlib for handling file paths, and Google's generative AI library, which allows us to interact with the Gemini AI model. Next
we will configure Google's Gemini API by
setting up our API key. So this will
allow us to connect to the AI model and
generate insights from medical images.
So before proceeding let's get our API
key, and we will go to Google to generate an API key.
So on your left there is an API key
option and after clicking you will get
the create API option. So just select
your model and create your API key.
So as you can see on the screen, just copy this API key and go back to the code. Now let's configure our model: just type genai.configure, and inside the brackets give api_key equal to, and over here paste the key.
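In code, the configuration step looks roughly like this; the string below is only a placeholder, never commit a real key:

```python
genai.configure(api_key="YOUR_API_KEY_HERE")   # placeholder, paste your own key
```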
Now we set up the system prompt which
defines the role of the AI model. So the
prompt specifies that our AI is a
medical image analysis system capable of
detecting diseases like cancer,
cardiovascular issues, neurological
conditions and more. So guys I have
already researched the prompt and written it here. So basically, the system prompt should be inside triple quotes.
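The exact wording of the instructor's prompt isn't shown in full, but an illustrative placeholder in the same spirit could be:

```python
# illustrative placeholder only; substitute your own carefully worded prompt
system_prompt = """You are a medical image analysis assistant for healthcare professionals.
Analyze the uploaded medical image (X-ray, MRI or CT scan), describe any visible
anomalies such as tumors, fractures or infections, and produce a structured
diagnostic report. Always recommend consulting a qualified doctor before acting
on the findings."""
```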
So this prompt guides the model to
analyze medical images for conditions
such as cancer, fractures, infections
and more making it a valuable tool for
healthcare professionals. Now let's
Now, let's configure the model settings for generating responses. We define parameters like temperature and top-k to control the creativity of the model's output. First, type generation_config equal to a dictionary, and inside it give temperature, which is 1, then top_p, which is 0.95, next top_k, which is 40, then max_output_tokens, which is 8192, and finally response_mime_type, which is text/plain. So over here, the temperature of 1 controls randomness; a value of 1 gives balanced output diversity. Next, top_p of 0.95 uses nucleus sampling, selecting tokens from the smallest set whose cumulative probability reaches 95%, for diverse responses. Next, top_k of 40 limits token selection to the 40 most probable tokens, narrowing possible outputs to high-probability tokens. Next, max_output_tokens allows for longer responses by capping the maximum length of the generated text at 8,192 tokens. And then we have response_mime_type, which specifies the format of the output as plain text. For more information, read the Google Gemini documentation.
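Put together, and assuming the google-generativeai SDK, the configuration dictionary looks roughly like this:

```python
generation_config = {
    "temperature": 1,                    # randomness: 1 gives balanced output diversity
    "top_p": 0.95,                       # nucleus sampling over the top 95% probability mass
    "top_k": 40,                         # consider only the 40 most probable tokens
    "max_output_tokens": 8192,           # cap on the length of the generated report
    "response_mime_type": "text/plain",  # return plain text
}
```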
Next, we will also configure safety settings to ensure that the model doesn't generate harmful content. For example, we block categories like harassment, hate speech, and sexually explicit content. Here we are using two things for each entry: first the category and then the threshold. Then repeat this for each category, such as harassment, hate speech, and sexually explicit content.
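A sketch of those safety settings, using the harm category names from the Gemini SDK:

```python
# Block responses in these harm categories at medium-or-above severity.
safety_settings = [
    {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
    {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
    {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
    {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
]
```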
Now let's set up the layout for our Streamlit application. For that we will configure the title and the layout of the page and even add a logo to make the interface more user friendly. First, type st.set_page_config and, inside the brackets, give page_title equal to, inside double quotes, Diagnostic Analytics, then a comma and page_icon equal to a robot emoji. Now let us type column1, column2, column3 equal to st.columns and, inside the brackets, give 1, 2, 1. Next, with column2, I'll be using the Edureka and medical images, so this will show you how to set up images using Streamlit. Now type st.image, open a bracket and, inside double quotes, type edureka.png, give a comma, and set width equal to 200. Now let us copy and paste it for the medical image, so let's type medical.png. Here we are using Streamlit's columns to center the logo and title, and this makes the app look professional and visually appealing.
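As a sketch, the layout code looks like this (the two image files are whatever local logo assets you have, so the names are placeholders):

```python
import streamlit as st  # already imported earlier in the walkthrough

st.set_page_config(page_title="Diagnostic Analytics", page_icon="🤖")

# Three columns with a wider middle column, used to center the logos.
col1, col2, col3 = st.columns([1, 2, 1])
with col2:
    st.image("edureka.png", width=200)   # brand logo (local file)
    st.image("medical.png", width=200)   # medical illustration (local file)
```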
Next, let's allow the user to upload medical images for analysis. We use Streamlit's file uploader widget to accept image files in PNG, JPG, or JPEG format. For that, let's type uploaded_file equal to st.file_uploader and, inside the brackets, inside double quotes, let's type please upload the medical image for analysis, then a comma, then type equal to a list containing png, jpg, and jpeg. Next, let us type submit_button equal to st.button and, inside the brackets, let's give generate image analysis. So here, when the user uploads a file and clicks the generate image analysis button, the model processes the image and prepares it for analysis.
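In code, the uploader and the button look roughly like this:

```python
uploaded_file = st.file_uploader(
    "Please upload the medical image for analysis",
    type=["png", "jpg", "jpeg"],
)
submit_button = st.button("Generate image analysis")
```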
Once the user submits the image, we send it to the AI model for analysis, and then the model generates a response based on the prompt and image, which we then display in the app. So here, as you can see on the screen, we have another block. The if submit_button check runs the code when the submit button is pressed. Next, image_data equals uploaded_file.getvalue(), which gets the raw image data from the uploaded file. Next we have image_parts, which creates a list with the image data in a structured format. Then we have prompt_parts, which combines the image data and a text prompt for the model. This part of the code sends the image and text prompt to the model to generate a response. And then we have st.write, which displays the model's response in the app. So here we use the image data and the system prompt to generate content with the Gemini AI model. The result is displayed as a detailed report with insights about the medical image.
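A sketch of that block. It assumes a model object was created earlier with genai.GenerativeModel(...) using the generation and safety settings above; the model name and the mime type here are assumptions you would adjust to your setup and to the uploaded file type.

```python
# Assumed earlier in the script:
# model = genai.GenerativeModel("gemini-1.5-flash",
#                               generation_config=generation_config,
#                               safety_settings=safety_settings)
if submit_button and uploaded_file is not None:
    image_data = uploaded_file.getvalue()                          # raw bytes of the upload
    image_parts = [{"mime_type": "image/jpeg", "data": image_data}]
    prompt_parts = [image_parts[0], system_prompt]                 # image + instructions
    response = model.generate_content(prompt_parts)
    st.write(response.text)                                        # show the generated report
```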
Now it's time to test the code. So open the terminal and type streamlit run main.py. Once you hit enter, it will redirect you to our app's interface. And there you go. So the app is ready. Here's a live demo of the app. We will upload a sample image and the app will analyze it and provide a detailed diagnosis based on the AI model's insights. So this is how we use Streamlit and Google's Gemini AI model to create a medical image analysis app. This app can help medical practitioners by offering precise and thorough analysis of medical images. Now it is time for testing. So let's take one image of any disease and test it. Upload the image from your computer. Then we will select an image and press the generate button. As you can see, it's running. So it generates a fabulous response and can help doctors in assisting their patients, saving time and money. So this is how we built a real-time medical diagnostic helper using Streamlit, Python, and Google Gemini AI.
[music]
LLMs like GPT-4 and Gemini 2.0 are massive models trained on huge data sets, capable of generating highly sophisticated and nuanced responses. On the other hand, SLMs like DistilGPT or TinyGPT are smaller, more efficient models designed for faster and more lightweight tasks. So understanding the differences
between them is crucial for selecting
the right model for your needs. Now
let's dive right in with our first
question. What exactly are LLMs and SLMs?
LLMs which are large language models are
powerful AI systems trained on vast data
sets offering deep contextual
understanding and sophisticated
responses. Models like GPT-4 and Gemini 2.0 are examples, whereas SLMs like DistilGPT or TinyGPT are streamlined for speed and efficiency, excelling in lightweight tasks. So both serve distinct purposes, balancing quality, cost, and
performance. All right. Now that we have
got a good idea of what LLMs and SLMs
are, let's talk about why this
comparison is so important. As AI
adoption grows across industries, the
choice between LLMs and SLMs becomes
more important. LLMs offer deep
contextual understanding and complex
outputs while SLMs provide efficiency
and speed. So choosing the wrong model can lead to excessive costs, slow performance, or subpar results. And by
understanding the strengths and
trade-offs, you can make more informed decisions and optimize your AI-driven solutions. So now let's dive into the core differences between LLMs and SLMs
and see what sets them apart. So first
let us compare in terms of model size
and complexity. So when it comes to
model size and complexity, LLMs often
have billions of parameters and require
vast computational resources to train
and run. Their large size enables them
to generate high-quality, context-rich
responses. And on the other hand, SLMs
are designed with fewer parameters,
often in millions, making them lighter
and faster. They prioritize efficiency
over complexity, which makes them ideal
for simpler tasks. Next, let us compare
in terms of performance and output
quality. So when it comes to performance
and output quality, LLMs are known for
their exceptional ability to handle
complex conversations, creative writing
and deep analysis. Their vast training
data ensures diverse and sophisticated
responses. On the other hand, while SLMs
are efficient, they may sometimes
struggle with nuanced or open-ended
queries. However, they excel in
straightforward, well-defined tasks. Next,
let's compare them with speed and
latency. When it comes to speed and
latency, LLMs can experience longer
response times and higher latency due to
their large size, especially when
processing extensive input data. Whereas
SLMs are designed for speed, offering
quicker responses and making them well
suited for real-time applications where
low latency is crucial. Next, in terms of
cost and resource efficiency. So when it
comes to cost and resource efficiency,
LLMs require significant hardware
investments such as powerful GPUs and
extensive cloud resources which lead to
higher operational costs, whereas SLMs, with their smaller footprints, are more
affordable to deploy and maintain making
them accessible even with limited
computational resources. Now let us
explore the real-world use cases of LLMs and SLMs. LLMs are ideal for creative
content generation, customer service
chat bots with advanced capabilities,
deep data analysis, and long- form
conversations. On the other hand, SLMs
are perfect for lightweight virtual
assistants, real-time customer support,
simple automation and tasks that require
quick turnaround times. Now, let us see the advantages and disadvantages of using LLMs and SLMs. The key advantages
of LLMs include their superior
understanding of complex language, the
ability to generate high-quality, nuanced responses, and better generalization across a wide range of diverse tasks. The main
drawbacks of LLMs are their high
computational and cost demands along
with slower response times due to their
large size and complexity. Now let us
have a look at the advantages of SLMs.
SLMs offer several advantages, including
their speed and efficiency, lower
operational cost and easier deployment
even on limited resources. The primary
disadvantages of SLMs are their limited
contextual understanding and their
tendency to struggle with complex
open-ended queries. Now that we have
explored the strengths and limitations,
let's take a look at what the future
holds for LLMs and SLMs in AI
development. So both LLMs and SLMs will
play a vital role in the future of AI. We
can expect ongoing improvements in
efficiency, quality and adaptability.
Hybrid approaches that combine the
strengths of both models could become
more common offering balanced
performance and scalability. So the
conclusion we get is that the choice between LLMs and SLMs depends on your specific needs. If you prioritize depth, nuance, and high-quality output, LLMs are the best choice. If speed,
efficiency, and cost are more important,
SLMs are the way to go. So by
understanding their strengths and
limitations, you can select the right
model and unlock AI's full potential for
your projects.
[music]
Have you seen how tools like ChatGPT with vision can look at an image you upload and describe it? Or how DALL-E and Midjourney can generate stunning images from just a text prompt? And now some AI models can even do both at the same time. They can see, read, listen, and even create, all in one go. So how is that possible? Well, that's because of something called multimodal AI. AI that
doesn't just work with one type of data
like only text or only images, but can
understand and combine multiple types of
information together just like we humans
do. So, in this video, we are going to
break down what multimodal AI really means and how multimodal AI works, and explore some amazing real world examples that you are probably already using without even realizing it. So first, let's break down the word multimodal. Multi means many, and modal refers to the modes of information like text, images, sound, or video. So multimodal AI is an
AI that can understand and work with
multiple types of data at the same time.
For example, a single AI model that can
read text, look at images, listen to
audio, watch videos, and combine all of
this to give a better answer. It sounds
a bit like how humans process
information, right? So, why do we need multimodal AI? Think about how we
interact with the world. If you're
watching a movie, you're seeing visuals,
listening to dialogue, and understanding
the story together. Or when you're
explaining a recipe to someone, you
might show pictures, describe steps, and
maybe even play a video. So, humans naturally combine different senses to understand things. Old AI models were single-modal. They could only process one type of data: a text model could only read and write, and a vision model could only look at images. But real world problems are not just text or just images. They are mixed. So multimodal AI bridges this gap and lets AI connect the dots between text, visuals, audio, and more. So how does multimodal AI work? In simple terms, it
works like this. It takes different
types of input. For example, it could
take a photo and a text question about
that photo. Then it converts them into a
common language inside the AI model. So
think of it like translating text,
images and audio into one shared
understanding. Next, it reasons over all
the data together. Then it gives you a
smart answer that considers all the
inputs. So for example, you show AI a
picture of a dog and ask what breed is
this? So it looks at the image,
understands the features and responds
that looks like a golden retriever. So
it's combining vision plus language to
answer. Now let us go through a working diagram of a full multimodal pipeline. As you can see on the screen, first it takes different inputs. It could be text, an image, or even a video. Then it uses encoders for each modality. Later these inputs are translated into a common AI language. Then a multimodal transformer
uses cross attention to connect
relationship across text, images and
audio. And finally the model generates a
response. So let me take another example
to explain this diagram. So as you can
see we have different inputs. So the
model can take text, images, audio or
even video as input. Next is the
encoders for each modality. That means a
text encoder converts words into vectors
and an image encoder converts pixels
into vectors and then an audio encoder
converts sound waves into vectors.
Next is the shared embedding space where
all these different inputs are
translated into a common AI language
which is a vector space where similar
meanings are close together. For
example, the word car and a picture of a
car are mapped close together. Next is
the fusion plus reasoning layer where a
multimodal transformer uses cross
attention to connect relationship across
text, images and audio. For example, it
links the word red to the red region of
the car image. Next is the output
generation. So finally, the model
generates a response which could be
text, a caption, an image (like DALL-E generates), or
even sound. All right, I hope this is
clear now. So now let's look at some
real world examples that make it easier
to understand. So first we have ChatGPT with vision. If you upload an image to ChatGPT and ask what's in this picture, then it can describe the objects and text, or even analyze data like a chart. So that's multimodal AI. It's using both
image understanding and text generation
together. The next example is Google
Lens. So when you point your camera at
something, Google Lens can recognize the
object, read the text in the image and
translate it into another language. Again, it's vision plus language plus translation, all in one model. The next example could be self-driving cars. Autonomous cars like Tesla's use multimodal AI because they have to see
the road through cameras, read traffic
signals, hear alerts and also process
maps and text instructions. So, they
combine all these modes to make driving
decisions. Next is the healthcare AI. So
doctors now use AI that can look at
medical images like X-rays and also read
patient reports combining the
information to help diagnose diseases
more accurately. But why is multimodal AI a game-changer? Multimodal AI is powerful
because it's closer to human
intelligence. We don't rely on one
sense, we combine many. And it makes AI
more flexible because one model can
handle text, images, audio, and more. It
can solve more complex problems like
explaining what's happening in a video
or understanding a full conversation
with context. All right. Now, for those of you who want a bit more technical depth, here's a quick peek behind the scenes. As I discussed earlier, multimodal AI uses transformer-based models, the
same type of models behind GPT. So the
text, images and audio are all converted
into a common representation, like a
shared language of numbers called
embeddings. For example, a picture of a
dog and the word dog are both mapped
into a similar space. So the AI knows
they mean the same thing. Then the model
can reason across all modalities
together and generate an output. A great example is CLIP from OpenAI, which connects images and text. Another is Google Gemini, designed from the ground up as a truly multimodal model. So what
is the biggest challenge? So the
different types of data have different
formats and complexity. Combining them
efficiently without losing meaning is
still an ongoing research area. So it's not just magic. It's smart design that lets the AI translate everything into one common understanding. Let's now look at some of the most important multimodal models, how they work, and where they are used. So here are the key multimodal models. First on the list we have CLIP, which stands for Contrastive Language-Image Pre-training, from OpenAI. So let's see
how it works. It has two encoders, a text encoder and an image encoder. Both encoders map inputs into the same embedding space. During training it learns that this caption matches this image and that this caption does not match that image. So it uses contrastive learning. It pushes correct pairs closer and incorrect pairs further apart. So here is the working diagram. It takes the input, be it an image or text, and then encodes it: an image goes through a vision encoder and text goes through a text encoder, then both are projected into a shared embedding space, and finally it generates the output. So
let's have a look at the use cases. It is used in DALL-E and Stable Diffusion to align text prompts with images. Next, it is used in zero-shot classification, where you give it a photo of a dog versus a photo of a cat and it recognizes which one matches the image without retraining. And it is used in search, where it finds images similar to a given caption.
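To make zero-shot classification concrete, here is a small sketch using the Hugging Face transformers implementation of CLIP; the image file name is just a placeholder.

```python
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("dog.jpg")                       # placeholder image file
texts = ["a photo of a dog", "a photo of a cat"]    # candidate captions

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Higher probability = caption whose embedding sits closest to the image embedding.
probs = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(texts, probs[0].tolist())))
```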
Next, moving on to the second model, which is BLIP-2. It stands for Bootstrapping Language-Image Pre-training. So, let us see how it works. First, it connects a frozen vision encoder, for example a CLIP or ViT encoder, with a frozen large language model (LLM). A querying transformer acts as a bridge where it converts visual
features into a language friendly
representation. So here is the working
diagram. The AI first looks at the image
and turns it into features like objects, colors, and shapes. Then a small bridge model called the Q-Former takes those
visual features and converts them into a
format the language model can
understand. Next the large language
model then reasons about the image
features just like it reasons about
text. And finally it generates a text
answer or a caption describing the
image. So the vision encoder sees, the Q-Former translates, and the LLM explains.
So let's have a look at the use cases.
So first it is used in visual question
answering for example what's in this
picture. Next, in image captioning, where it can give a caption like a man riding a horse on a beach. Next, in chatbots with vision, for example where you upload an image and ask questions. Okay. The next model on the list is Flamingo from
Deep Mind. So let's have a look at its
working. So here's how it works. So
first, it's a few-shot multimodal model. It doesn't need huge fine-tuning for a new task. And then it uses gated cross attention layers to integrate image plus text inside a frozen LLM, and it can
reason across multiple images and a long
text sequence. So it looks at the image,
reads your question, connects both
through cross attention and then
explains it. So let's have a look at the
use cases. It is used in multimodal chatbots, like look at these five images and now answer this question. Next, it is used in educational AI, where it reads diagrams and answers questions. Next, in document understanding, where it reads text plus images in a PDF. And the next multimodal model on the list is PaLM-E from
Google. So, here's how it works. So, as
you can see, this is the working
diagram. So first the AI gets both
visual input like a photo or a live
camera feed and the text instructions
like pick up the red apple on the table.
Next, the vision transformer understands what's in the image, like objects, colors, and positions, and the PaLM language model
understands the instruction and reasons
about what needs to be done. So it's
combining both. The AI creates a
step-by-step action plan for the robot
like move forward, grab the red apple
and place it in the basket. So, here are
the use cases. It is used in robotics
like pick up the red apple on the table.
Next, it is used in real world reasoning
for embodied AI. Then, it is also used
in visual navigation tasks. And the next multimodal model is Google Gemini. So, here's how it works. It's natively multimodal, trained from scratch on text, images, audio, and video. So unlike CLIP, which aligns two encoders, Gemini has a single
model handling all modalities and it
uses joint training with cross
attention. So this is the working
diagram. So let me explain this. The AI
takes in all types of inputs at once
such as written text, pictures, sound,
and even video. Then instead of using
separate models for each type, it uses
one powerful transformer model that can
understand and combine all these inputs
together. And from that combined
understanding, it can give any kind of
output, a text answer, a generated image
or even an audio response. So basically
it understands everything together and
responds in any form you need. So let us
have a look at its use cases. It is used
in complex queries such as summarize
this video and create a chart. It is
used in advanced digital assistants and also in future AR/VR multimodal applications. The next model is GPT-4o from OpenAI. It's an optimized multimodal model. It accepts text, images, and audio in real time, and it uses fused embeddings and parallel processing
for speed and it works as a true
interactive assistant. So here are its
use cases. It is used in conversational
AI with vision plus audio and in
real-time assistance, where you upload an image and get an explanation instantly, and
also in accessibility tools for example
describe surroundings for visually
impaired users. So these models
represent different approaches to
multimodality.
Some align separate encoders, like CLIP. Some bridge vision plus an LLM, like BLIP-2. And some are natively multimodal, like Gemini and GPT-4o. So now let us see how multimodal models are trained. Training multimodal models is much more complex than training single-modal models. So first is the data set
alignment. So you need paired data sets
such as images plus captions, videos
plus transcripts and audio plus text. So
the challenge is the text and images
don't always align perfectly. Next is
the contrastive learning. So train a
model to pull matching pairs closer and
push non-matching pairs apart. For
example, image of a cat plus caption a
cat is a matching pair. Whereas image of
a cat plus caption as a a dog is not a
matching pair. Next is the masked
modeling. Mask parts of the input such
as image patches text tokens as model
predicts missing information. Then it
forces the model to reason across
modalities. For example, mask the object
in a caption a dash is sitting on the
table plus providing image. Next is the
fusion and cross attention training
where models like Flamingo or Gemini
train cross attention layers to
integrate modalities. It requires huge
compute clusters. Next is scaling laws: like LLMs, multimodal models get better with size and data diversity. Gemini and GPT-4o are trained on massive multimodal corpora. So here are the training
requirements. You need to have high-quality paired data sets, billions of
parameters and TPUs, GPUs for weeks or
months and advanced optimizations such
as mixed precision or sharded training. So why is true multimodal AI still hard? It's because of data mismatch. Text is sequential, images are spatial, and audio
is temporal. So aligning them perfectly
is difficult. Next is limited high
quality data. So billions of image text
pairs exist, but they are noisy and biased. Next is bias and fairness: models learn cultural and social biases from multimodal data, for example stereotypes in images and captions. The next challenge is compute cost. Training needs huge GPU clusters, for example hundreds of A100 GPUs, and fine-tuning multimodal models is even
more expensive than text only. And the
final challenge is the evaluation
difficulty. So how do you measure
reasoning across modalities? So there's
no single easy benchmark. So while
multimodal AI is powerful, it's also data hungry, compute heavy, and still evolving. So in simple words, multimodal AI can process and combine multiple types of data such as text, images, audio, and video. It's already in use, in tools such as GPT-4 with vision, Google Lens, self-driving cars,
and healthcare AI. It's a big step
towards AI that can understand the world
like humans do. So what do you think?
Will multimodal AI make AI more
humanlike? So drop your thoughts in the
comments.
>> [music]
>> So what is a transformer? Transformers
operate on a concept called sequence-to-sequence learning. Essentially, they take a sequence of tokens as an input and predict the next token in the output. A great example of this is language translation. Imagine inputting good morning in English, and the transformer processes this and outputs the translation in languages like Japanese, Korean, or German. The key is how efficiently it processes the relationships between words.
Since we know what a transformer is,
let's dig a bit deeper into them. A
transformer has two primary components: an encoder and a decoder. The encoder identifies relationships between parts of the input sequence, whereas the decoder uses these
relationships to generate the output
sequence. This division is what allows
transformers to handle tasks like text translation or summarization with
remarkable accuracy. Now that we have
the idea of transformers, let's discuss
how they evolved. Before transformers, there were other neural networks like RNNs, recurrent neural networks, invented by David Rumelhart in 1986.
However, RNNs faced significant
challenges. They would forget early
parts of the sequence as they processed
longer ones and couldn't handle
dependencies efficiently. Additionally,
RNNs relied on recurrence, which made them
inefficient and incapable of
parallelization.
Then came long short-term memory
introduced by Hochreiter and Schmidhuber in 1997. Long short-term memory improved on this by remembering sequences for a longer duration and addressing some of the memory issues in RNNs.
However, they were slow to train and
difficult to manage at scale. Finally,
transformers transformed neural
networks. First introduced in the landmark paper Attention Is All You Need, transformers addressed all the problems faced by RNNs and LSTMs. They
used a completely attention-based
mechanism, eliminating reliance on
recurrence. This made transformers
capable of remembering context
efficiently, training faster and being
parallelized, enabling multitasking and
significantly speeding up processes.
Now, let's discuss on the attention
mechanism. Think about the sentence.
This cat wants to jump on the box. The
attention mechanism identifies the most
relevant parts of this sentence like
cat, jump and box and focuses on these
elements while processing the data. Now
that we know how transformers have
evolved, now let's discuss their
architecture. A transformer consists of
two main components, an encoder and a
decoder. Each typically consists of six layers. Inside the encoder, there is one
attention layer and one feed forward
layer. While the decoder contains two
attention layers and one feed forward
layer. The magic of parallelism comes
from how data is fed into the network.
In the attention layer, all the words
are processed simultaneously with each
word forming combinations with others in
the sentence. This allows the model to
capture relationships and context
efficiently. After processing in the
attention layer, the data is sent to the
feed forward layer where it is learned
layer by layer. The input to the encoder
and decoder are the raw input embeddings
which are numerical representations of
words. On top of these embeddings,
positional encodings are added to help
the model understand the position and
the order of each word in the sequence.
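For the technically curious, the sinusoidal positional encoding from the original transformer paper uses PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)); here is a minimal sketch of it:

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    # One row per position, one column per embedding dimension.
    pos = np.arange(seq_len)[:, None]            # (seq_len, 1)
    i = np.arange(d_model)[None, :]              # (1, d_model)
    angle = pos / np.power(10000, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])         # even dimensions use sine
    pe[:, 1::2] = np.cos(angle[:, 1::2])         # odd dimensions use cosine
    return pe

print(positional_encoding(seq_len=4, d_model=8).shape)  # (4, 8)
```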
If we simplify embeddings, they are essentially vector representations of words in an n-dimensional space. At the top of the architecture, there are two layers producing the output probabilities, converting the final output into a form humans can understand. These output vectors have a length corresponding to the size of the vocabulary.
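At the heart of each attention layer is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. Here is a minimal NumPy sketch:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)              # similarity of each query with each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V                           # weighted mix of the values

# Toy example: 3 tokens with 8-dimensional queries, keys, and values.
Q = K = V = np.random.randn(3, 8)
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 8)
```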
Now, what truly makes transformers unique is the inclusion of normalization layers, which normalize the output from the sub-layers. Additionally, skip connections, the dark arrows in the architecture, forward critical information that bypasses the self-attention or feed-forward layers directly to the normalization
layers. This ensures the model does not
forget important details and effectively
passes vital information further into
the network. Now, moving forward, let's
discuss why transformers are important.
Transformers are vital because they
utilize semi-supervised learning. They
are trained on massive unlabelled data
sets enabling them to generalize across
a wide range of tasks. Unlike older
models, transformers don't need to
process data sequentially. Their
attention mechanisms allow them to focus
on the most relevant context which
significantly speeds up training.
Transformers revolutionize data
processing by eliminating the need to
handle data sequentially allowing for
parallel processing and significantly
enhancing efficiency. The attention
mechanism lies at the core of
transformers, enabling the model to
focus on the most relevant parts of the
input sequence and improving accuracy
and understanding of context.
Furthermore, transformers excel at
providing context, ensuring that the
meaning of each word or token is
accurately interpreted within its
surroundings. Lastly, these models
dramatically speed up the training
process, making them faster and more
efficient compared to traditional neural
networks, thus redefining AI's
capabilities across diverse
applications. Now that we know why
transformers are important, let's
discuss some applications.
We have OpenAI's GPT, a groundbreaking
model that leverages the power of
transformers for natural language
processing task. Additionally, Google
has developed several transformer-based models, including the Vision Transformer for image recognition, BERT (Bidirectional Encoder Representations from Transformers) for understanding the context of words in a sentence, and T5, which stands for Text-to-Text Transfer Transformer, for a wide range of text generation tasks. Microsoft has also
contributed with DeBERTa, Decoding-enhanced BERT with Disentangled Attention, a model designed to improve contextual understanding and enhance NLP applications. These models
demonstrate the versatility and impact
of transformer architecture across
various domains. Now that we know the
application of transformers, how about
checking their real-world products? Transformers have become an integral part of many real world products that we use daily. Examples include Grammarly, which leverages transformers for advanced grammar and writing assistance; Google Search and its translation tools, powered by models like BERT and T5; and ChatGPT, OpenAI's conversational AI that relies on the Generative Pre-trained Transformer architecture. Additionally, Meta's deepfake detector uses transformer-based models for facial recognition tasks. These applications
highlight how transformers have
revolutionized technology, seamlessly integrating into tools that enhance
our everyday lives. In conclusion,
transformers are changing the tech world
by enabling smarter, faster, and more
efficient AI systems. Whether it's
generating text, translating languages,
or enhancing search engines, these
models are the cornerstone of modern AI.
[music]
So what are RNNs, right? Well, RNN basically stands for recurrent neural network, and we usually use this in order to deal with sequential data. Sequential data can be something like time series data or textual data of any format. So why should one use RNNs, right? This is because there's a concept of internal memory here. RNNs can remember important things about the input they have received, which allows them to be very precise in predicting what the next outcome can be. So this is the reason why they are preferred for sequential data. Okay. And some of the
examples of sequential data can be
something like time series, speech,
text, financial data, audio, video,
weather and many more. Although RNNs were the state-of-the-art algorithm for dealing with sequential data, they come with their own drawbacks. Some of the notable drawbacks here are that, due to the complexity of the algorithm, the neural network is pretty slow to train, and as there is a huge number of dimensions here, the training is very long and difficult to do. Okay. Apart from that, the most
decisive drawback of RNNs, and the driver for improvements to them, is that of the vanishing gradient. What this vanishing gradient means is that, as we go deeper and deeper into our neural network, the earlier data is lost. This is because of a concept called the vanishing gradient, and due to this we cannot work on a large or longer sequence of data. Okay. To overcome this,
we came up with some new upgrades to the current recurrent neural networks, or RNNs. Starting off with the bidirectional recurrent neural network. You see, bidirectional recurrent neural networks connect two hidden layers of opposite directions to the same output. With this form of deep learning, the output layer can get information from past and future states simultaneously. So as you can see here, we have two layers over here, and as they are bidirectional, what happens is, when the algorithm feels that it is kind of losing its gradients or the previous data, it can go back and get the data from the past. So why do we need bidirectional recurrent neural networks? Well, a bidirectional recurrent neural network duplicates the RNN processing chain so that the input is processed in both forward and reverse time order, thus allowing the bidirectional recurrent neural network to look into future context as well. The
next one is long short-term memory. Long short-term memory, also sometimes referred to as LSTM, is an artificial recurrent neural network architecture used in the field of deep learning. Unlike standard feed-forward neural networks, LSTM has feedback connections. It can process not only single data points but also entire sequences of data. So as you can see here, what I'm trying to say is that with LSTM, or long short-term memory, we can feed a longer sequence compared to what we could with a bidirectional RNN or a plain RNN. So why is LSTM better than RNN? We can say that when we move from RNN to LSTM, we are introducing more and more control over the sequence of data that we can provide; LSTM gives us more controllability, and that leads to better results. All right. So the next type of
recurrent neural network is the gated recurrent unit, also referred to as GRU. You see, GRU is a type of recurrent neural network that in certain cases is advantageous over long short-term memory. GRU makes use of less memory and is also faster than LSTM. But the thing is, LSTMs are more accurate when using longer sequences of data.
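To see these three layer types side by side, here is a small PyTorch sketch on a toy batch of sequences; the sizes are arbitrary.

```python
import torch
import torch.nn as nn

# Toy batch: 4 sequences, 10 time steps, 8 features each.
x = torch.randn(4, 10, 8)

rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)
lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True, bidirectional=True)
gru = nn.GRU(input_size=8, hidden_size=16, batch_first=True)

out_rnn, h_n = rnn(x)          # plain RNN: hidden state only
out_lstm, (h, c) = lstm(x)     # LSTM adds a cell state; bidirectional doubles the output size
out_gru, h_gru = gru(x)        # GRU: fewer gates, less memory than LSTM

print(out_rnn.shape, out_lstm.shape, out_gru.shape)
# torch.Size([4, 10, 16]) torch.Size([4, 10, 32]) torch.Size([4, 10, 16])
```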
I'm sure by now you might have got a hint about the trend that has led to these improvements, right? So the trend over here is that the model should be capable of remembering and taking in a longer input sequence. The game changer for sequential data was developed when we came up with something called transformers, and this work was based on a paper called Attention Is All You Need. All right. So let's take a
look at this. The paper Attention Is All You Need introduces a novel architecture called the transformer. Like LSTM, the transformer is an architecture for transforming one sequence into another, with the help of two parts, that is, encoders and decoders. But it differs from previously described sequence-to-sequence models because it does not work like GRUs. Okay, so it does not implement recurrent neural networks. Recurrent neural networks until now were one of the best ways to capture the temporal dependencies in a sequence. However, the team presenting this paper, that is Attention Is All You Need, proved that an architecture with only an attention mechanism, without using RNNs, can improve its results in translation tasks and other NLP tasks. One of the best examples of
transformers is Google's BERT. So what
exactly is this transformer? Right? You
see here we have encoder on the top and
decoder on the bottom. Both encoder and
decoder are comprised of modules that can be stacked on top of each other multiple times. So what happens here is that the inputs and outputs are first embedded into an n-dimensional space, since we cannot use them directly. So we obviously have to encode whatever inputs we are providing here. One slight but important part of this model is the positional encoding of the different words. Since we have no recurrent neural network that can remember how sequences are fed into the model, we need to somehow give every word or part of our sequence a relative position, since a sequence depends on the order of its elements. Okay, these positions are added to the embedded representation of each word. All right, so this was a brief overview of transformers. So let us now
move ahead and see some of the popular
language models that are available in
the market. All right, so let us now
start off by understanding OpenAI's
GPT-3. The successor to GPT and GPT-2 is GPT-3, one of the most controversial pre-trained models by OpenAI. The large-scale transformer-based language model has been trained with 175 billion parameters, which is 10 times more than any previous non-sparse language model. The model has been trained to achieve strong performance on many NLP data sets, including tasks like translation and question answering, as well as several other tasks. Then we have Google's BERT.
BERT stands for Bidirectional Encoder Representations from Transformers. It is a pre-trained NLP model developed by Google in 2018. With this, anyone in the world can train their own question answering module in as little as 30 minutes on a single Cloud TPU, or a few hours using a single GPU. The company then released it, showcasing its performance on 11 NLP tasks, including the very competitive Stanford Question Answering Dataset. Unlike other language models, BERT has been pre-trained on 2,500 million words of Wikipedia and 800 million words of BooksCorpus, and has been successfully used as a pre-trained model in deep neural networks. According to researchers, BERT has achieved 93% accuracy, which has surpassed previous language models. Next, we have
ELMo. ELMo, also known as Embeddings from Language Models, is a deep contextualized word representation that models the syntax and semantics of words as well as their linguistic context. The model, developed by Allen NLP, has been pre-trained on a huge text corpus and learns its functions from a bidirectional language model, that is, a biLM. ELMo can easily be added to existing models, which drastically improves performance across a broad range of NLP problems, including question answering, textual entailment, and sentiment analysis.
[music]
Prompt engineering is an interesting
field that combines artificial
intelligence and human language
understanding. In this field,
professionals and researchers work to
create prompts or instructions that
effectively guide AI systems to produce
the expected outcome. Whether it's
fine-tuning language model, designing
prompts for specific task, or optimizing
human machine communication, prompt
engineering is crucial for leveraging
the power of AI for a variety of
applications. Imagine you are developing
a virtual assistant application using a
large language model such as GPT-3. The
goal is to provide users with an
engaging and helpful experience by
designing effective prompts that
generate informative and relevant
responses from the model. So let's
consider a scenario in which the virtual
assistant assists users with travel planning. So here's how prompt
engineering plays a major part. So the
scenario is you're planning a trip to
Paris and want the virtual assistant to
provide recommendations for activities,
restaurants, and landmarks to visit
during your stay. So let's say you're looking for help with a traditional prompt and you ask, what should I do in Paris, and the virtual assistant will respond with something like, here are some recommendations for activities in Paris. And here's how the enhanced prompt, through prompt engineering, responds to
your queries. So if you input a query
that goes like hey there I'm super
excited about my upcoming trip to Paris.
So could you please recommend some must
visit places and activities for me then
the virtual assistant will generate the
response as something like this. So of
course Paris is an amazing city with so
much to offer. So here are some must
visit places and activities and it
continues with the explanation about
each place. I hope you got the idea of
how enhanced prompt provides users with
an engaging and helpful experience by
designing effective prompts that
generate informative and relevant
responses from the model. So now let us
understand what exactly is prompt
engineering. Prompt engineering is a
method used in natural language
processing that is NLP and machine
learning. It's all about crafting clear
and precise instructions to interact with large language models like GPT-3 or BERT. So these models can generate humanlike
responses based on the prompts they
receive. Think of prompt engineering as
giving direction to these models. By
crafting specific and concise prompts,
we guide them to produce the response we
want. So to do this effectively, we need
to understand the capabilities of the
model and the problem we are trying to
solve. Finetuning prompts allows
researchers and developers to improve
the performance and usability of LLMs for a variety of applications, including text generation, question answering, language translation, and others. Effective prompt engineering necessitates a thorough understanding of the underlying model's capabilities as
well as the problem domain and desired
result. Now let's find out why prompt
engineering matters for AI. So prompt
engineering is important in AI because
it improves model performance,
customization, and reliability. By
creating clear and tailored prompts,
developers can help AI models produce
more accurate and relevant result,
reduce biases, improve user experience,
and address ethical concerns. In simple
terms, prompt engineering ensures that
AI systems produce useful and reliable results that meet the needs of users
while adhering to ethical principles. So
now let's consider an example in the
context of text generation for
generating product description. Assume
you're using an AI model to create
product description for an online store.
So without prompt engineering, you may
issue a generic prompt such as generate
a product description for a smartphone.
So without prompt engineering, you would
get something like this. This smartphone
has a high resolution display, powerful
processor, and a longlasting battery
life. The given prompt is less effective
because it lacks specificity. So it
simply says, generate a product
description for a smartphone. So this
may make it difficult to come up with an
idea and write something engaging and
informative. So having a good prompt can
make a significant difference in your
writing. They give you a clear idea of
what you need to write about and keep
you focused and organized making it
easier to generate ideas and express
yourself. On the other hand, by using
prompt engineering techniques, you can
provide more specific instructions or
constraints that will tailor the
generated descriptions to the target
audience or brand style. So with prompt
engineering, if you input a query such
as create a product description for a budget-friendly smartphone perfect for young professionals, highlighting that it's affordable, sleek, and packed with top-notch camera features, then the generated response would be something like this: introducing our sleek and affordable smartphone designed for young professionals; with its stylish design and advanced camera features, capturing life's moments has never been easier, and it goes on giving its key
features along with it. So through this
example, we understood that prompt engineering enables the creation of a product description that is useful to the target audience and highlights specific features based on the instructions provided. So this shows how prompt engineering can improve the relevance and effectiveness of AI-generated content for specific applications.
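As a small sketch of the difference in practice, the two prompts could be sent to a model like Gemini through the google-generativeai SDK; the model name and prompt wording here are illustrative, not a fixed recipe.

```python
import google.generativeai as genai

genai.configure(api_key="<your-api-key>")
model = genai.GenerativeModel("gemini-pro")   # illustrative model name

generic_prompt = "Generate a product description for a smartphone."
engineered_prompt = (
    "Create a product description for a budget-friendly smartphone aimed at "
    "young professionals. Highlight that it is affordable, sleek, and packed "
    "with top-notch camera features. Keep it under 80 words."
)

for prompt in (generic_prompt, engineered_prompt):
    response = model.generate_content(prompt)
    print(response.text)
    print("---")
```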
To help AI models give accurate answers, it's important to create clear prompts. So here are some
simple rules for generating effective
prompts. First, make it clear. So
clearly explain what you want the AI to
do. Unclear prompts might confuse the AI
and lead to wrong answers. So make sure
that the prompt is clear. For example, an unclear prompt is something like write about cars, where we have not
mentioned which type of car or anything
much in detail, whereas the clear prompt
is write a description of a red
convertible sports car. Next give
context. So provide enough information
so that the AI understands the task. So this helps it give accurate responses that make sense in the given situation.
So for example, prompt without context
is write a story. Prompt with context is
write a story about a girl who discovers
a magic book in her attic. Next, show
examples. Use examples to show the AI
what you are looking for. So this helps
it understand the type of response you
want. So for example, the prompt without
an example is describe a beach scene, and the prompt with examples is describe a beach scene with palm trees, crashing waves, and people playing
volleyball. Next is keep it short. So
don't overload the AI with too much
information. Short prompts help the AI
focus and give quicker, more accurate
responses. For example, long prompts are
like this. Write a detailed essay
discussing the impact of climate change
on biodiversity and ecosystems in
tropical rainforests. And short prompts look something like this: write about climate change effects on rainforests.
Next, avoid biases. So, make sure your
prompts are fair and don't include any
unfair assumptions. So, biased prompts
can lead to biased answers which isn't
helpful. So, for example, a biased prompt is write about a woman who struggles with her weight, and the unbiased prompt is write about a person overcoming challenges. Next, set
limits. So tell the AI any rules or
restrictions it needs to follow. This
helps guide its response and ensures
they meet your specific needs. For
example, the prompt without limits is write a story, and the prompt with limits is write a story set in a haunted house with a maximum word count of 500 words. And I hope it's very clear. Next, moving on to some examples of prompts for generating text using ChatGPT. For text generation tasks, prompts usually consist of textual instructions or a starting point
that directs the model to produce
coherent and relevant text. Prompts can
be story prompts, questions, or
incomplete sentences. Text generation
prompts provide context and directions
to the model, allowing it to generate
humanlike text responses. They influence
the generated text's tone, style, and
context. So let's say the prompt is
write a short story about a character
who discovers a hidden treasure. So by
providing a specific story line and
theme in the prompt, the model is guided
to generate a coherent and engaging
narrative centered around the discovery
of a hidden treasure. So the picture
illustrates how ChatGPT crafts stories with an engaging touch, making them more
captivating and interesting for readers.
Next question answering. So prompt is
can you describe the common signs and
symptoms of COVID 19 along with any
precautions that can be taken to stay
safe and just like that it can generate
answers to all your questions in mere
seconds. So by framing the prompt as a
question the model is directed to
provide a concise answer regarding the
symptoms of COVID 19 ensuring relevant
and informative responses. Next,
language translation. Translate the
given English sentence, the quick brown fox jumps over the lazy dog, into Spanish
while maintaining its original meaning.
So, by specifying the source and target
language in the prompt along with the
input sentence, the model is instructed
to perform a precise translation task
ensuring accurate language conversion.
Next, code auto completion using OpenAI
Codex or ChatGPT, you can perform code auto-completion tasks. So here we go with ChatGPT. Code generation prompts are usually
partial code snippets or descriptions of
programming task. They specify the
desired functionality or behavior that
the model should show. Code generation
prompts allow the model to generate code
that satisfies specific programming
requirements such as implementing
algorithms, defining functions, or
solving coding problems. So the prompt
is complete the following Python
function to calculate the factorial of a
number and here you have also added the
function. So by presenting an incomplete
code snippet along with clear
instructions the model is directed to
suggest appropriate code completion
helping developers write code more
efficiently. Now moving on to the text
to image generation. Image generation
prompts specify the visual scene,
objects or concept that the model should
generate. They may include textual
descriptions, keywords or images. So
image generation prompts tell the model what visual content to generate. They influence the generated image's composition, style, and detail. For
example, the prompt is imagine a tree
where the branches are made of stacks of
books. So can you paint me a picture of
that? And for the given prompt, we got
the image generated as something like
this. An imaginative portrayal of a tree
with branches composed of stacked books, each book representing a leaf, with its cover visible.
And the next prompt is picture a cloud
in the sky that looks like a huge heart.
Can you draw that for me? And here we
go. These AI tools leverage prompt
engineering techniques to generate text,
perform language translation, code auto
completion, and text to image
generation, demonstrating the
versatility and power of prompt-based interactions with AI models. Next, why
is machine learning useful in prompt
engineering? Machine learning is very
helpful in prompt engineering especially
in linguistic and language models
because it helps create better prompts
and interactions by analyzing lots of
data and finding patterns. So first
understanding language patterns. Machine
learning algorithms can analyze large
amounts of text to understand linguistic
patterns like grammar, syntax, semantics
and context. So this understanding is
critical for developing effective
prompts that generate desired responses
from language models. Next, generating
relevant prompts. Machine learning
models can suggest or generate prompts
based on input data and user
preferences. These prompts can be
tailored to specific task, domains, or
user requirements, making them more
useful and efficient for guiding
language models. Next, optimizing
prompts design. Machine learning
techniques can be used to optimize
prompt design by comparing the
performance of various prompts and
selecting the one that produces the best
result. This iterative process improves
prompt engineering practices and the
overall performance of language models.
And the next is personalizing
interactions. Machine learning enables
personalized interactions by tailoring prompts to individual users' preferences, history, and context. This personalization increases user engagement
and satisfaction with the language model
interaction. Next, improving model
performance. Machine learning algorithms
can be used to fine-tune language models
based on prompt-response pairs, increasing
their performance and accuracy over
time. Language models can be trained on a variety of data sets and prompts to
produce more relevant and contextually
appropriate responses. And next,
mitigating bias and misinformation.
Machine learning techniques can help
identify and mitigate biases in prompt engineering by examining prompt-response pairs for potential biases or
inaccuracies.
Language models can produce more fair,
inclusive, and reliable results by
detecting and correcting for these biases. And I hope it is clear why
machine learning is useful in prompt
engineering.
>> [music]
>> Now let us understand what LangChain is and why it is a valuable tool for building AI applications. You must be aware of popular applications such as ChatGPT and Gemini. These applications utilize APIs to process prompts: ChatGPT uses OpenAI's API, while Gemini operates through the Gemini API. They leverage models like GPT-3.5, GPT-4, PaLM, and Gemini 1. Additionally, there are other advanced models such as Llama, Gemini, Cohere, Claude version 1, Falcon, PaLM, GPT-4, and GPT-3.5. LangChain is a
framework designed to help developers
build flexible and powerful AIdriven
applications by integrating and
utilizing these diverse models
effectively. But why exactly do we need LangChain? You must be thinking, if LLMs are this capable on their own, then why do we need LangChain? So let's break down
this question using some real world
examples. So imagine simply asking an
LLM a prompt and getting an answer.
That's easy. But what happens when the
complexity increases? For example, let's
say you're working with data from SQL
databases, CSV files, PDFs, or Google
Analytics, and you need the model to
write code, perform searches, or send
emails. Handling such intricate
workflows manually can get overwhelming.
This is where LangChain steps in. It
simplifies the process by offering
components like document loaders, text
splitters, vector databases, prompt
templates, and tools. So this helps you
assemble tasks such as document
summarization, question and answer
systems or even advanced workflows like
Google searches or customer support
automation. Let's visualize this process with a diagram. Here's how it works. First, you load a document like a CSV file using a document loader. Then use a text splitter to divide it into smaller chunks, store those chunks in a vector database, and add a prompt template to guide the model. And finally, use an LLM like GPT-4 or Llama to perform tasks like searching the web or automating workflows. LangChain also offers chains that help you assemble components to achieve a single task, such as summarization, and agents that figure out what each component must do, for use cases like customer service, etc.
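As a rough sketch of that pipeline; import paths change between LangChain releases, so treat the module names below as assumptions to check against your installed version, and the file name and query are placeholders.

```python
from langchain_community.document_loaders import CSVLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# 1. Load a document with a document loader (data.csv is a placeholder).
docs = CSVLoader("data.csv").load()

# 2. Split it into smaller chunks with a text splitter.
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_documents(docs)

# 3. Store the chunks in a vector database.
vectorstore = FAISS.from_documents(chunks, OpenAIEmbeddings())

# 4. Retrieve relevant chunks; a prompt template plus an LLM (e.g. GPT-4)
#    would then use this context to answer or summarize.
relevant_chunks = vectorstore.similarity_search("What does the report say about revenue?")
```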
Now that we understand LangChain's core components, let's explore how it streamlines the LLM application life
cycle.
So it typically involves three key
stages. First is the development where
you build and test your application.
Then productionization where the system
is fine-tuned for a real world use. And
finally deployment where the final
product is launched for users. So LangChain simplifies this life cycle, allowing you
to focus on building without worrying
about the underlying complexity. Now
let's take a step back and understand
the role of APIs in powering these LLM
applications and how LangChain effectively
integrates them. In all these
applications and models, one thing is
common that is they use API. So now
let's discuss APIs. APIs act as an
intermediaries that enable different
systems to communicate with each other.
For example, they allow apps like Swiggy
or Blinket to display your delivery
driver's location in real time. So now
let's look at the steps to explain APIs
and API keys. So apps like Zepto, Swiggy, and Blinkit use APIs to show the
location of your delivery driver. So
these apps don't communicate directly
with Google Maps but follow a layered process involving servers and security mechanisms. First, the app sends a request
to the Google Maps API. Then the API forwards the request to Google's servers. Then the servers validate the request with the security system. So once approved, the response flows back through the servers and APIs, and finally to the app. So previously,
apps like Swiggy allowed login using
phone numbers. Now they use API for
login via platforms like Google or
Facebook. So this demonstrates the
versatility of APIs in enabling
seamless user interactions. To prevent
misuse, APIs require API keys which are
unique identifiers for secure access. So
these keys authenticate requests and ensure that only authorized users can interact with the APIs. Next, security systems closely monitor API usage to detect and prevent misuse. This ensures that APIs remain safe and functional for their intended purpose. These steps explain, at a high level, how APIs and API keys work in real-world
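As a rough illustration of that request-plus-key pattern, here is a small Python sketch; the endpoint URL, parameters, and key value are hypothetical placeholders, not a real Google Maps or delivery API.

```python
# Minimal sketch of an authenticated API call (hypothetical endpoint and key).
import requests

API_KEY = "YOUR_API_KEY"  # unique identifier that authorizes this client
response = requests.get(
    "https://maps.example.com/v1/driver-location",  # placeholder URL
    params={"order_id": "12345"},
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=10,
)
response.raise_for_status()   # the server rejects requests with a missing or invalid key
print(response.json())        # e.g. {"lat": 12.97, "lng": 77.59}
```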
applications. So this is how LangChain leverages APIs to connect your LLM applications with external tools, making them versatile and secure. Now that we understand the role of APIs, let's explore some real-world applications of LangChain. So what can you build with LangChain? Here are a few applications.
First application we have is customer
support. So customer support for your
shopping websites to interact with
customers. Next, conversational chat
bots for helping you study, content
generation tools for blogs or social
media. We also have question answering
systems for knowledge bases and then
document summarizers for legal or
academic content. LangChain simplifies AI development by integrating LLMs with various data sources and tools. Its applications are vast, from chatbots to document summarization. So, let's examine a practical example to see LangChain in action.
All right. In today's data-driven world,
understanding and effectively using SQL
queries is crucial for managing and
analyzing large data sets. However,
beginners and even experienced users
often need help with complex SQL
queries, their syntax, and how they
work. This creates a barrier to
efficiently interacting with databases
and limits their potential to solve real
world problems. To address this
challenge, we propose a SQL query
fetcher application that leverages the
Gemini AI, Python, and Streamlit to
simplify SQL learning and usage. The
application allows users to input or
select a query, generates the SQL
syntax, and provides a detailed
explanation of its components and
functionality. This tool bridges the gap
between technical understanding and real
world database operations, empowering
users with an intuitive and interactive
SQL learning experience. Let's jump
right into the code. So the first step
is setting up your dependencies. Here we
import streamlit for the user interface
and then Google generative AI for using
Gemini. So first, import streamlit as st, and next import google.generativeai as genai.
So to get this API you have to go to the
Google Gemini API key and here click on
get a Gemini API key in Google AI studio
and then once you scroll there is a
button on the left called create API.
Now click on it and select your model
here
and let's copy it. And now let's go back
to our VS code editor and paste it here.
So to paste let's type
API
key and inside the double quote let's
paste it. And now let's type genai.configure, and inside the brackets let's pass api_key equal to google_api_key. Now let's type model equal to genai.GenerativeModel, and inside the brackets let's keep it as gemini-pro.
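Put together, the setup narrated above looks roughly like this; the key value is a placeholder and should live in an environment variable or a .env file rather than in the source.

```python
import streamlit as st
import google.generativeai as genai

GOOGLE_API_KEY = "YOUR_GEMINI_API_KEY"       # placeholder: keep real keys out of code
genai.configure(api_key=GOOGLE_API_KEY)      # register the key with the SDK
model = genai.GenerativeModel("gemini-pro")  # the Gemini model that will write SQL
```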
So we use the Google Gemini API to
generate SQL queries dynamically. So
make sure to configure your API keys
securely. Now let's write the Streamlit layout code.
Now let's set up the app's user
interface. So we use Streamlit to create
an interactive page where users can
input plain English queries and get SQL
code in return. So we write st.set_page_config, and inside the brackets let's give page_title equal to "Edureka SQL Query Generator", then a comma, and page_icon equal to a robot emoji. Now let's put in some images. So I'm using the Edureka image and the SQL logo, and to center them we will type col1, col2, col3 equal to st.columns, and inside the brackets let's keep the ratio as 1, 2, 1. Next, let us type with col2 colon, and then st.image, and inside the brackets let's give the image path and then width equal to 200.
Now let us add another image. So let's
copy the same and give the other image
address.
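Assembled, the layout narration above corresponds roughly to the following; the image file names are placeholders for whatever logos you use.

```python
import streamlit as st

st.set_page_config(page_title="Edureka SQL Query Generator", page_icon="🤖")

# Center the logos by placing them in the middle of three columns (ratio 1:2:1).
col1, col2, col3 = st.columns([1, 2, 1])
with col2:
    st.image("edureka.png", width=200)    # placeholder file name
    st.image("sql_logo.png", width=200)   # placeholder file name
```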
Our layout includes a title, logo and
text input box to keep the interface
simple and intuitive. So here's where
the magic happens. So when a user clicks
the generate SQL query button, we format
their input into a prompt for the Gemini
model to generate SQL code. So let's create the template by writing template equal to, and inside the triple quotes let's type: create a SQL query snippet using the below text. Next, let us also add the text input placeholder, and we'll also type: I just want a SQL query.
Now let's type the response. So type response equal to model.generate_content, and pass it template.format, and inside the brackets give text_input equal to text_input. Next, let's type the SQL query: so give sql_query equal to response.text, then chain the strip, lstrip, and rstrip functions to clean it up. So the AI generates the SQL
query and we clean up the output for
display. So once the SQL query is ready,
we take it a step further by generating
a sample expected output and a clear
explanation of the query. Now let's type
the logic for showing explanation and
output. So let's type st dot markdown
and inside the bracket we will give HTML
tags. So first div style is equal to we
will align the text and center. So text
align center and next let's give H1 tag
and inside H1 tag we will write SQL
query generator
and let's close the H1 tag. Next, let's
open H3 tag and write I can generate SQL
queries for you. And let's close the H3
tag. And inside the H4 tag, let's type
it as with explanation as well.
Now close the H4 tag and let's open the
paragraph tag which is the P tag. And
inside the P tag, let's type it as this
tool allows you to generate
SQL queries based on your data. Now let
us close the P tag. Also close the div
tag.
Now, to make the HTML in the markdown render, let us type unsafe_allow_html equal to True. Now let's write text_input equal to st.text_area, and inside the brackets let's give it as enter your query here in plain English. Now let us give a submit
button. So for that let us type submit
button equal to st dot
button and inside the bracket let us
give generate SQL query. Now, with if submit_button colon, write with st.spinner, and inside the brackets let's keep "generating SQL query", and then let's create a template, and inside the triple quotes let's type: create a SQL query snippet using the below text.
Now, using the above template, we will build the further prompts around the text input and the generated SQL query, one for the expected output and one for the explanation.
Now to merge all the templates together
we will make a container. So we will
write with st.container.
So it's a function and let us also write
it as st dot
success
and inside the bracket let us give it as
SQL query generated successfully
Also, we will give "here is your query below". Next, st.code, and inside the brackets let us give sql_query, comma, language equal to "sql". Now let us give once again st.success, and inside this let us keep it as "expected output of this query will be", and then st.markdown with the expected output inside the brackets. Once again st.success, and inside this let us keep it as "explanation of this SQL query".
Next, let us give st.markdown, and inside this function let us keep the explanation. So over here, this shows a green success message indicating the SQL query was generated successfully, and next we show the SQL query: this displays it as a formatted code block highlighted as SQL. Next is the expected output: st.success provides a success message for the query's expected output, followed by st.markdown, which displays that expected output in markdown format. Then the next lines introduce the explanation, and st.markdown displays it in markdown format for clarity. So this makes the tool valuable for both learning and debugging SQL.
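For reference, the core generate-and-display logic narrated above can be consolidated roughly as follows; it assumes the st and model objects from the earlier setup sketch, and the two extra prompts for the expected output and explanation are paraphrased, not the exact wording from the video.

```python
text_input = st.text_area("Enter your query here in plain English")
submit_button = st.button("Generate SQL Query")

if submit_button and text_input:
    with st.spinner("Generating SQL query..."):
        template = """Create a SQL query snippet using the below text:
        {text_input}
        I just want a SQL query."""
        response = model.generate_content(template.format(text_input=text_input))
        sql_query = response.text.strip()   # the video also trims markdown fences here

        # Two follow-up prompts (paraphrased) for the expected output and explanation.
        eoutput = model.generate_content(
            "Show a small sample of the expected output of this SQL query: " + sql_query).text
        explanation = model.generate_content(
            "Explain this SQL query in simple terms: " + sql_query).text

        with st.container():
            st.success("SQL query generated successfully. Here is your query below:")
            st.code(sql_query, language="sql")
            st.success("The expected output of this query will be:")
            st.markdown(eoutput)
            st.success("Explanation of this SQL query:")
            st.markdown(explanation)
```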
So now let's see it in action. Open the terminal, type streamlit run, and give your file name.
Now, as you can see on the screen, your SQL
query generator is ready to go. Now
let's test it. So for that here I will
input a prompt asking for a query which
is give me the query for create table.
Now let's click on generate SQL query
and as you can see it's running. So
let's wait for it to generate.
So as you can see on the screen the app
generates a SQL query expected output
and even a plain English explanation in
seconds. So how cool is that right? And
that's it. Our SQL query generator, powered by LangChain, the Gemini API, and Streamlit, is complete. So this project
is perfect for simplifying SQL learning
and enhancing productivity.
[music]
So before we talk about agents, let's
quickly understand LangChain. LangChain is a
framework designed to help you connect
large language models such as GPT with
external tools, APIs, memory, and custom
logic. Normally, LLMs like ChatGPT can
only generate responses based on the
text you give them. But what if you
wanted to search the web, run Python
code, query a database, or use a
calculator? That's where LangChain comes
in. It acts as a bridge between the LLM
and the tools it can use to interact
with the real world. And one of the most
powerful features in lang chain is
agents. So what exactly is a LangChain agent?
So think of it like this. Instead of you
telling the AI exactly what to do, you
just give it a goal and the agent
figures out how to get it done. An agent
combines the power of reasoning,
decision making, and tools. It uses the
LLM to understand the task, choose which
tools it needs, call those tools in the
right order, and then return the final
result to the user. It's like giving
your AI assistant a toolbox and letting
it decide which tools to use based on
the question you ask. So, let's look at
a real example to make it clear. Imagine
this prompt. Check the current stock
price of Apple and calculate the average
over the past 5 days. A regular chatbot
can't do that. But a LangChain agent can
use a web search API to find today's
stock price and use a Python tool to
calculate the average and then respond
with the results all automatically. So
here's what's happening under the hood.
The LLM receives your prompt and it
decides it needs to search and calculate
and then it picks the right tools, maybe SerpAPI for search and Python for math,
and it performs each step in a sequence
and gives you the final output. So this
is all done dynamically meaning you
don't hardcode each step. So the agent
figures it out using the language models
reasoning. Now let's look at the inner
working of a LangChain agent. So when you create an agent in LangChain, you define three things: first, the LLM to use, like GPT-4 or Claude; next, the tools available, like a calculator, web search, or database query; and finally, the agent type. LangChain supports types like the zero-shot agent and the conversational agent.
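As a minimal sketch of defining those three things, here is one way to do it with LangChain's classic initialize_agent API (newer releases favor LangGraph-based agents); the tool list and model are illustrative, and the SerpAPI tool needs its own API key.

```python
# Minimal agent: an LLM, a list of tools, and an agent type.
from langchain_openai import ChatOpenAI
from langchain.agents import initialize_agent, load_tools, AgentType

llm = ChatOpenAI(model="gpt-4")                        # 1. the LLM
tools = load_tools(["llm-math", "serpapi"], llm=llm)   # 2. the tools (calculator + web search)
agent = initialize_agent(                              # 3. the agent type
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)

agent.run("Check the current stock price of Apple and average it over the past 5 days.")
```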
So now that you understand how LangChain agents work, let's quickly talk about the
two most popular types. So first we have
the zero-shot agent. So this is the most
commonly used agent. It works by giving
the language model a list of tools along
with a description of what each tool
does. Then the model uses that
information to figure out on the fly
which tool to use and in what order. So
it's called zero-shot because the model
doesn't get examples. It just reasons
based on the tool descriptions. And it
is best for tasks that don't need memory or a back-and-forth conversation, like data lookups, calculations, or
API calls. And the next type is the
conversational agent. This one is more
advanced. It is designed for multi-turn
conversations. That means the agent
remembers previous steps and keeps track
of what's already been done. So it uses
a chat history and a memory module to
maintain context across multiple
prompts. And it is best for chat bots,
virtual assistants or tools where the
user asks follow-up questions or expects
the AI to remember context. So in short, I can say that the zero-shot agent is for fast, simple, one-shot tasks, whereas the conversational agent is for context-aware, back-and-forth dialogues. There are also other agents, like tool-using agents, plan-and-execute agents, or multi-action agents, for more advanced workflows, which are perfect for future deep dives. Now,
to understand lang chain agents let's
quickly explore its core building
blocks. So first we have the LLMs. This is the brain of the system. LangChain supports models like OpenAI's GPT-3.5 or GPT-4, as well as Anthropic's Claude, Hugging Face models, Ollama, Cohere, etc. And the next
component is prompts these are the
templates that guide the LLM's behavior.
You can use static prompts or chat
prompt template for more dynamic and
multi-turn interactions.
Next we have chains. It is a sequence of
calls or logic. Next tools. These are
the external functions the LLM can call
such as Python calculator, web search
API or SQL query executor. Then we have
agents. Agents dynamically decide which
tool to use and when based on your
input. So agents are what turn LangChain from a chatbot into a multi-tool problem solver. So over here, the tools are just Python functions wrapped in LangChain's tool format. For example, a simple wrapped Python function is sketched below. It's like giving your AI assistant a toolbox and letting it decide what to use based on your prompt.
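A minimal version, using the @tool decorator from langchain_core; the function itself is just an illustration.

```python
# Wrapping a plain Python function as a LangChain tool.
from langchain_core.tools import tool

@tool
def average_price(prices: str) -> float:
    """Return the average of a comma-separated list of stock prices."""
    values = [float(p) for p in prices.split(",")]
    return sum(values) / len(values)

# The agent sees the function's name and docstring as the tool description
# and decides on its own when to call it.
```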
Next, the agent then follows a process
called react, which stands for reasoning
plus acting. So, here's what that looks
like. So, first, the agent receives the
prompt. Then the LLM decides, "I need to search the web." So LangChain calls the search tool, and the tool returns the result. Next, the LLM reasons, "Now I need to do a calculation." So LangChain calls the calculator tool, and the agent returns the final answer. So here
is the example of the react loop. So the
thought is I need to find today's
weather. The action it takes is that it
uses a weather API. Next is the
observation. For example, it's a 28ยฐC in
Bangalore. The thought is now I can tell
the user the temperature. And the final
answer would be it's currently 28ยฐC in
Bangalore. So this entire flow is
written and passed by the LLM itself
using intermediate steps called scratch
packs. So the lang passes those steps
and knows when to call a tool or stop.
Now let's look at where Linen agents are
used in real world projects. So first it
is used in AI customer assistance. So
the agents can look up user info, reset
passwords, and respond to queries
automatically. So the users can ask
things like what was my profit margin
last quarter. So the agent pulls data
from a database, does the math and
explains it. Next, LangChain agents can
be used in research tools. So you can
build a research bot that searches
multiple sources, summarizes and gives
you an answer step by step. then in
automated workflows, like send a message, create a task in Trello, and update the
CRM all with one prompt. So that's the
power of Langchain agents. So they allow
your language models to take action, use
tools and solve real world tasks step by
step. So let me know in comments if you
want a full coding tutorial on building
your first LangChain agent.
RAG is a hybrid approach in artificial intelligence that combines retrieval systems with generative models to produce highly accurate, contextually relevant responses. It bridges the gap between factual accuracy and natural
language generation. Now let's
understand it with the help of a diagram.
So it's a hybrid approach involving
artificial intelligence that combines a
retrieval system with a generative
system to produce highly accurate
responses. Now that we know what RAG is, let's explore why it is crucial for
large language models and see a real
world example. So RAG addresses several limitations of traditional LLMs. It mitigates hallucinations by grounding responses in factual retrieved data. By dynamically accessing up-to-date information, RAG stays relevant in rapidly changing domains. It improves accuracy and relevance by fetching specific, relevant documents during inference. By outsourcing factual knowledge retrieval, RAG enables smaller, more efficient models, and it can adapt to domain-specific knowledge bases for specialized applications.
Additionally, RAG provides explainability by showing the retrieved documents or data sources, increasing
trust and transparency. Now let us see
some of the use cases.
So without RAG, for a question like "When was the last Mars rover launched?", the model might give an outdated or incorrect response. With RAG, the answer would be dynamically retrieved from NASA's database, and it would be: the Perseverance rover was launched on July 30, 2020. Now that we have seen why RAG
is important. So let's dive into how it
works. Well, RAG operates in a three-step process. A user submits a query, which triggers the retrieval stage. Here, a retriever searches a database or knowledge base using techniques like BM25 to fetch the most relevant information. The
retrieved data is then fed into a generative model like GPT or T5, which processes it and generates a coherent, contextually grounded natural language response. Now let's take an example here. The query is: who wrote 1984? The retrieval step would fetch a document containing "George Orwell wrote 1984." The generated response would then be: the author of 1984 is George Orwell. This hybrid approach makes RAG ideal for real-world applications like chatbots and knowledge systems.
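Here is a toy sketch of that three-step loop, using the rank_bm25 package for the retrieval step; the documents, query, and the placeholder generation call are illustrative.

```python
# Toy retrieve-then-generate loop: retrieve with BM25, then hand the context to an LLM.
from rank_bm25 import BM25Okapi

documents = [
    "George Orwell wrote 1984.",
    "The Perseverance rover was launched on July 30, 2020.",
]
bm25 = BM25Okapi([doc.lower().split() for doc in documents])

query = "who wrote 1984"
context = bm25.get_top_n(query.lower().split(), documents, n=1)[0]  # steps 1-2: retrieve

prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
# answer = llm.generate(prompt)  # step 3: generate (placeholder call, depends on your model)
print(context)
```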
Now that we understand how RAG works, let's explore some of its real-world applications.
RAG's versatile applications span
various domains. In knowledge
management, it can summarize large
databases or documentation, aiding
corporate teams. Legal and compliance tasks benefit from RAG's ability to answer queries based on case law and regulations, while in healthcare, it can
support medical professionals by
summarizing research papers and
guidelines. Education and e-learning can leverage RAG for virtual tutoring, providing detailed explanations based on textbooks and research papers. Interactive virtual assistants like Alexa and Siri can utilize RAG to
generate accurate and informative
responses to user queries such as news
headlines or product recommendations.
RAG's unique ability to combine retrieval and generation makes it essential for tasks demanding both factual accuracy and fluent natural language responses. Now let's compare
retrieval augmented generation with traditional AI models across a few features. First we have factual accuracy: RAG provides highly accurate responses by using real-time data, whereas traditional models may give less accurate answers and may make errors.
Next is context adaptability: RAG adapts quickly to new queries using live data, whereas traditional models offer fixed answers based only on pre-trained knowledge. Next we have knowledge updates: RAG is easy to update, you just change its data source, whereas traditional models need retraining, which takes time. Then we have scalability: RAG scales by growing or swapping its knowledge base, whereas traditional models are limited by model size and training data. And then we have use cases: RAG is great for tasks like legal advice or customer support, whereas traditional models work well for creative writing or casual queries. So
here I want to conclude that RAG is ideal for knowledge-based tasks needing accuracy and flexibility, while traditional models are better for creative uses. While RAG offers
significant advantages, it's essential
to acknowledge its limitations. So let's
discuss the challenges and future of RAG. So the first challenge is latency. RAG systems can suffer from latency issues, especially when dealing with large data sets or complex queries.
Next is the data quality dependency. The
quality of the generated responses
heavily depends on the quality of the
underlying data. The next challenge is
complex integration. Integrating RAG
systems with existing applications and
infrastructure can be challenging due to
the need for data synchronization, query
optimization and model management. And
finally, scalability issues. As RAG systems become more complex and are
deployed at scale, they can face
scalability issues. This includes
handling increased query loads,
maintaining data freshness, and ensuring
model performance. Now, while RAG faces limitations, its potential is undeniable. So now let's discuss RAG's future. The future of RAG holds immense
potential. It will power dynamic
real-time applications like news summarization, financial analytics, and
live sports commentary. Rag will be
customized for specific domains like
healthcare, law, and science through
integrations with specialized knowledge
bases. Advances in retrieval models and
compression techniques will reduce
latency and enhance efficiency. RAG will expand to handle multimodal data, enabling use cases like multimedia question answering. Additionally, RAG will facilitate personalized AI assistants and improve transparency and explainability by attributing sources and providing clear explanations.
Now let us move on to generative AI
project using RAG. So imagine you're
working with a massive library of
documents. You need a way to quickly
search and answer questions based on the
content. So manually flipping through
pages takes time and effort. Wouldn't it
be great to have a system that retrieves
relevant information and answers your
questions directly within those
documents? So that's where our Streamlit
app comes in. This app utilizes the
power of natural language processing and
advanced retrieval techniques to turn
your complex document collections into a
powerful question and answer system. So
let's take a look at the code behind
this app. This app will allow users to
ask questions about a collection of PDFs
and get answers directly from the
documents using the power of natural
language processing. Now, first let's
create a virtual environment. Now, in
the terminal, let's type the command for
setting up the environment in your
editor. For that, let's type conda create -p venv python==3.10 -y and press enter. In this command, -p venv specifies the path and the environment name, while -y skips the prompts for a smoother install. Now, while that's setting up, let's create a few essential files. Then let's activate the new environment with the command conda activate venv/.
So as you can see our environment is
ready. Now let's import libraries.
So let's start by importing the
libraries we will need. In the first
line, we will import streamlit as st.
This gives us access to all the
functionalities of Streamlit for
building our web app interface. So next
we will import OS for various operating
systems functionalities.
After that now we will import libraries
from LangChain, which is a framework for
building NLP pipelines. So we will use
these for task like text splitting for
document chain creation, prompting,
retrieval and more. So we will explain
each library in detail as we use them.
So let's type from langchain_groq import ChatGroq. Next, we will type from langchain.text_splitter import RecursiveCharacterTextSplitter. Again, let us type from langchain.chains.combine_documents import create_stuff_documents_chain, then from langchain_core.prompts import ChatPromptTemplate, and from langchain.chains let's import create_retrieval_chain. Next, import FAISS from langchain_community.vectorstores; this will help us create a vector index for efficient document retrieval. So let us type from langchain_community.vectorstores import FAISS.
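Collected in one place, the import block looks roughly like this, including the loader, embeddings, and dotenv imports that are introduced a little later; the exact package split (langchain, langchain-groq, langchain-community, langchain-google-genai, python-dotenv) is an assumption about the installed versions.

```python
import os
import streamlit as st

from langchain_groq import ChatGroq
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from dotenv import load_dotenv
```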
Similar imports will follow for other
functionalities like document loading
and generating embedding. But we will
introduce them as they appear in the
code. But before this, go to the Groq Cloud website, and on your left you have the API key option. So select it and create your API key
and copy this. And if you want to check
your model then go to the playground and
at the top right corner click on the
Llama model and check; there are so many of them, including the latest. So choose your
model and generate your free API key. Now go to your editor and paste it into the .env file using the variable GROQ_API_KEY.
Now, again, go to Google AI Studio. There you have the create API key option. So select your model and create your API key. Now copy the key and paste it into your environment file, that is, the .env file, using the variable GOOGLE_API_KEY.
Now we will load the environment variables from the .env file that securely stores our API keys. For that we use python-dotenv: let's type from dotenv import load_dotenv, and also call the load_dotenv function. Next, we use os to retrieve the Groq API key and Google API key from the environment variables with os.getenv. So let us type groq_api_key equal to os.getenv("GROQ_API_KEY"), and in the next line let us type os.environ["GOOGLE_API_KEY"] equal to os.getenv("GOOGLE_API_KEY"). So here, these keys are required to use the specific NLP services.
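That key-loading step, written out, is roughly the following; the variable names GROQ_API_KEY and GOOGLE_API_KEY match the .env entries created earlier.

```python
import os
from dotenv import load_dotenv

load_dotenv()                                               # reads the .env file
groq_api_key = os.getenv("GROQ_API_KEY")                    # used by ChatGroq
os.environ["GOOGLE_API_KEY"] = os.getenv("GOOGLE_API_KEY")  # used by the Gemini embeddings
```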
Now let us write the code for displaying the app title and images. So for that, load your image.
Since I'm using edureka image name
edureka.png along with the app title
edureka document question and answer we
will use st.image and st.title for this
purpose. So for that, let us type st.image, and inside the double quotes let us keep the image name, then comma, width equal to 200. And let us also keep the title. So for that, st.title, and let us type Edureka Document Question and Answers. Now the next step is to
initialize ChatGroq and the prompt template. Now it's time to interact with the Groq API through LangChain. So initialize the ChatGroq object using the Groq API key, and specify llama3-8b-8192, which is the language model we will be using for our NLP task. So for that, let us type llm equal to ChatGroq, and inside the brackets give groq_api_key equal to groq_api_key, comma, and we will give the model name as well. So for that, type model_name equal to, and inside the double quotes give the model name; here we are using llama3-8b-8192.
All right. Now let us define a prompt
template using chat prompt template. So
this template ensures that AI responses
are based on the context provided and
user questions. So keeping answers
accurate and concise. So for that we
will type prompt equal to ChatPromptTemplate.from_template, and inside the brackets let us paste the prompt. So here we have the prompt, which says: please answer the question strictly based on the provided context. Also ensure the response is accurate, concise, and directly addresses the question.
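Roughly assembled, the model and prompt setup looks like this; the prompt text is paraphrased from the one shown in the video.

```python
llm = ChatGroq(groq_api_key=groq_api_key, model_name="llama3-8b-8192")

prompt = ChatPromptTemplate.from_template(
    """Please answer the question strictly based on the provided context.
Ensure the response is accurate, concise and directly addresses the question.
<context>
{context}
</context>
Question: {input}"""
)
```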
Now let's create a function for embedding vectors. For that, let's define the vector embedding function: so type def vector_embedding, open and close brackets, and give a colon.
Next, in the next line, give if, and under the double quotes give "vectors" not in st.session_state, then a colon. Then type st.session_state.embeddings equal to GoogleGenerativeAIEmbeddings, and inside the brackets give model equal to, and inside the double quotes let us type models/embedding-001.
Now make a folder where you will load
your PDF. So I am creating ed PDF and
paste your PDFs here. Now set the session state: let us type st.session_state.loader equal to PyPDFDirectoryLoader, and inside the double quotes let us paste the path of the PDF folder. Next is the data ingestion. For that, let us type st.session_state.docs equal to st.session_state.loader.load(). So this particular line of code is for data ingestion, and here this particular
line is for document loading. Next, let us type st.session_state.text_splitter equal to RecursiveCharacterTextSplitter, and inside the brackets let us give chunk_size and mention the size; here I'll give 1000, and comma, chunk_overlap equal to 200. Now, these are for the chunk creation.
Now let us type st.session_state.final_documents equal to st.session_state.text_splitter.split_documents, and inside the brackets let us give st.session_state.docs with the slice colon 20, to take the first 20 documents. So this line of code is for splitting. Now let us type st.session_state.vectors equal to FAISS.from_documents, and inside the brackets let us again type st.session_state.final_documents, comma, st.session_state.embeddings.
Okay, so this line of code creates the vector embeddings.
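Assembled from the steps above, the embedding function looks roughly like this; ed_pdf is the folder name used for the PDFs in this walkthrough.

```python
def vector_embedding():
    if "vectors" not in st.session_state:
        st.session_state.embeddings = GoogleGenerativeAIEmbeddings(
            model="models/embedding-001")
        st.session_state.loader = PyPDFDirectoryLoader("ed_pdf")      # load PDFs from the folder
        st.session_state.docs = st.session_state.loader.load()        # document loading
        st.session_state.text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=1000, chunk_overlap=200)                        # chunk creation
        st.session_state.final_documents = (
            st.session_state.text_splitter.split_documents(
                st.session_state.docs[:20]))                           # split the first 20 docs
        st.session_state.vectors = FAISS.from_documents(
            st.session_state.final_documents, st.session_state.embeddings)  # vector store
```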
Now, for the input field for the question, let us type prompt1: so give prompt1 equal to st.text_input, and inside let us type "Enter your question from any document".
Now, to create a button to load the embeddings, let us type if st.button, and inside the function let us give, under the double quotes, "Load Edureka DB", then give a colon. In the next line let us call the vector_embedding function, and next type st.success, and the message would be "Edureka DB is ready for queries".
If the question is asked, that is, if prompt1 is true, then type document_chain equal to create_stuff_documents_chain, and inside the brackets let us give llm comma prompt. In the next line, to retrieve, let us type retriever equal to st.session_state.vectors.as_retriever(). In the next line, let us type retrieval_chain equal to create_retrieval_chain, and inside the brackets let us give retriever comma document_chain. Now, to measure the response time, let us type start equal to time.process_time(). For the response, type response equal to retrieval_chain.invoke, and inside the brackets give a dictionary with input equal to prompt1. In the next line, let us type response_time equal to time.process_time() minus start. Next, let us write code to display
the response. So for that, let us type st.markdown, and inside the brackets let us keep it as "AI Response". Now, in the next line, let us type st.success, and inside the brackets give response with "answer" inside single quotes as the key. In the next line, write st.write with an f-string, and inside the curly brackets let us type response_time with the :.2f format specifier, followed by "seconds". Now,
moving on, let us write the code to display similar documents in an expander. For that, let us type with st.expander, and inside the brackets, under double quotes, type "Document similarity search results", and give a colon. In the next line, let us type st.markdown, and inside the brackets let us type "Below are the most relevant document chunks:". Come to the next line. Here, let us type for i comma doc in enumerate, and inside the function give response, get, "context". Now, in the next line, let us type st.markdown, and inside the brackets keep an f-string, and let us give the HTML tag, which is div class equal to card; open the p tag, and inside the p tag let us keep doc.page_content, and now close the p tag. Now let us close the div tag as well. Now let us come outside the triple quotes, give a comma, and type unsafe_allow_html equal to True.
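Putting the question-answer block together, the narrated steps correspond roughly to the following; it assumes the llm, prompt, and vector_embedding pieces defined in the earlier sketches.

```python
import time

prompt1 = st.text_input("Enter your question from any document")

if st.button("Load Edureka DB"):
    vector_embedding()
    st.success("Edureka DB is ready for queries")

if prompt1:
    document_chain = create_stuff_documents_chain(llm, prompt)
    retriever = st.session_state.vectors.as_retriever()
    retrieval_chain = create_retrieval_chain(retriever, document_chain)

    start = time.process_time()
    response = retrieval_chain.invoke({"input": prompt1})
    response_time = time.process_time() - start

    st.markdown("AI Response")
    st.success(response["answer"])
    st.write(f"{response_time:.2f} seconds")

    with st.expander("Document similarity search results"):
        st.markdown("Below are the most relevant document chunks:")
        for i, doc in enumerate(response["context"]):
            st.markdown(
                f"<div class='card'><p>{doc.page_content}</p></div>",
                unsafe_allow_html=True,
            )
```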
So you can also add inline styles and
HTML tags and also icons and emojis to
make your application fabulous for the
user. Now it's time for testing. For
that open your terminal and write
streamlit run and give your file name.
So once you press enter, there we go, here's our document question and answer app. Now pick a question from the PDF you have loaded and ask it here.
So as you can see this is my PDF. So I'm
going to copy some question from here.
So let me just copy this. Okay once
copied. So I'm going to paste it here.
So I'm going to click on Load Edureka DB. So guys, as you can see, it provides an answer based on the context given in the PDF. So this is the answer that it has generated. And that's all; we have used simple Python code, LangChain techniques for RAG, and some inline HTML and styles.
[music]
Have you ever wondered how massive AI
models like ChatGPT are managed and optimized? That's where LLM ops, which stands for large language model operations, comes in. LLM ops is the key to training, deploying, and scaling
large AI models efficiently while
keeping cost low and performance high.
It ensures faster responses, ethical AI,
and seamless integration into real world
applications. Large language model
operations is a set of practices, tools
and frameworks designed to efficiently
manage, deploy and maintain large
language models like ChatGPT, Claude, and Gemini in real-world applications. Just
like MLOps streamlines the development
of machine learning models, LLM ops
optimizes the life cycle of LLMs from
data processing and training to
deployment and monitoring. Now that you
know what LLM ops is, so let's explore
why it's important.
As LLMs become widely integrated into
business applications such as customer support chatbots, content generation tools, and automation systems, they need to be
continuously monitored and optimized.
Without proper LLM ops practices, large
language models can become inefficient,
leading to slower response times and
increased computational costs. So, they
may also become unreliable, generating
outdated or biased outputs that impact
user trust and decision making.
Additionally, these models can be
difficult to scale and struggle to
handle increasing user demand, which
can result in performance bottlenecks
and degraded user experience. For
example, imagine running a GPT-style AI on
a customer support chatbot. Without LLM
ops, responses would be slow,
repetitive, and expensive. LLM ops
optimizes the entire workflow. So now
that we understand why LLM ops is
important, so let's take a look at how
it differs from MLOps and what makes it
unique. All right. So LLM ops is a
specialized branch of MLOps, but it is
tailored for large scale language models
rather than traditional machine learning
models. So here are the key differences
between LLM ops and MLOps.
LLM ops differs from MLOps in several
key aspects. So in terms of data
complexity, LLM ops require vast amounts
of diverse text data whereas MLOps
typically works with structured or
tabular data. Next, compute power is another major difference, as training LLMs demands high-performance GPUs and massive cloud resources, while traditional ML models generally require lower compute power. And when it comes
to real-time processing, LLM ops necessitates scalable deployment to handle continuous inference efficiently, whereas MLOps often relies on batch processing or periodic inference. Lastly, ethical and bias considerations are more prominent in LLM ops, requiring constant monitoring to
detect and mitigate biases and
misleading outputs. Whereas bias
monitoring in MLOps is important but
generally less complex compared to LLMs.
Next, let us see how LLM ops works. So,
LLM ops follows a structured workflow to
ensure the efficient management of large
language models. So, it begins with data
collection and pre-processing where
large text data sets are cleaned and
structured for training. Next, model
training and fine-tuning help the AI
learn to understand and generate text
effectively. Once trained, the model
moves to deployment where it is run on
cloud servers, edge devices or APIs for
real world applications. And during
inferences and optimization, the model's
response speed is improved while
minimizing computational cost. Next,
monitoring and feedback loops play a
crucial role in tracking performance and
making adjustments based on real-world
usage. Finally, continuous improvement
ensures the model remains relevant by
updating it periodically with fresh
data. And here are some real world
examples. Companies like Open AI, Google
and Meta use LLM ops to maintain their
AI products without frequent manual
retraining. So now that we understand
how LLM ops works, so let's explore some
of the popular tools and frameworks that
make it possible to manage and optimize
large language models efficiently. LLM ops professionals rely on specialized tools to manage the model life cycle efficiently. One of the most popular platforms is Hugging Face, an open-source ecosystem for NLP and transformer models. MLflow is widely used for tracking experiments, model versions, and training metrics, while Kubeflow provides a scalable MLOps framework for deploying AI on Kubernetes. Companies use a combination of these tools to streamline their LLM ops pipelines and ensure smooth deployment.
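As a small illustration of the experiment-tracking side, here is a minimal MLflow sketch; the experiment name, parameters, and metric values are made up for the example.

```python
import mlflow

mlflow.set_experiment("llm-finetuning")       # hypothetical experiment name
with mlflow.start_run():
    mlflow.log_param("base_model", "llama3-8b")
    mlflow.log_param("learning_rate", 2e-5)
    mlflow.log_metric("eval_loss", 1.83)       # illustrative values
    mlflow.log_metric("latency_ms", 240)
```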
Now that we have covered the tools and
frameworks used in LLM ops, next let's
explore the career opportunities and the
future prospects in this rapidly growing
field. LLM ops is a rapidly growing
field with a high demand for skilled
professionals. A machine learning
engineer focuses on designing and
optimizing LLM models, ensuring their
efficiency and effectiveness. An AI
product manager oversees AI model
deployment for businesses, ensuring
smooth integration into real world
applications. The role of LLM ops
engineer involves managing AI
infrastructure and scaling models for
optimal performance. And if you have a
background in machine learning, cloud
computing or DevOps, transitioning into
LLM ops is a great move.
So, LLM ops plays a crucial role in managing large AI models efficiently, ensuring optimal performance, reduced costs, and ethical AI development. By leveraging top tools like Hugging Face, MLflow, and Kubeflow, professionals can streamline model training, deployment, and monitoring. And with the increasing adoption of AI across industries, career opportunities in LLM ops are booming, making it an exciting and rewarding field for AI enthusiasts looking to build
a future in artificial intelligence. And
what do you think about LLM ops? Drop
your answers in the comments.
[music]
It seems like everywhere you look today,
businesses are turning to AI agents to
automate complex workflows, boost
productivity, and make smarter decisions
across industries like finance,
healthcare, retail, and tech. More and
more companies are already using AI
agents to handle tasks that once
required entire teams of people. Maybe
you're here because you have heard the
buzz around agentic AI and want to
finally understand what it actually
means. Or maybe you're curious about
which framework is best to build your
own intelligent systems. And you might
be wondering what exactly is an agentic
AI framework? How is it different from
traditional AI tools? And how can you
use it to build autonomous AI systems
that don't just respond but actually
think, plan, and act. In this video, we
are going to break it all down step by
step. From understanding what AI agents
really are to exploring popular
frameworks, their key features, how to
choose the right one, and where they are
used in real world. Whether you're a
developer looking to build powerful
agents or a business leader exploring
automation, by the end of this video,
you will have a clear road map to
agentic AI frameworks. Now let's take a
quick look at how AI agents are making
real impact. Starting with a powerful
example in customer experience. Imagine
reaching out to customer support and
getting instant personalized help. No
repeating details. That's what agentic
AI brings to customer experience. With
rising customer expectations and growing
burnout among support teams, AI agents
are stepping in to make every
interaction smarter and faster. These
agents don't just respond, they learn,
remember, and act. Unlike traditional
chatbots that follow scripts, agentic AI
understands context, predicts needs, and
can even take proactive action like
offering a refund, opening a support
ticket, or escalating an issue before it
becomes a complaint. They use natural
language processing to hold real
conversations, sentiment analysis to
sense emotions and can smoothly hand
over complex cases to a human agent when
needed. And behind the scenes, these
agents assist customer service
representatives too, fetching data, troubleshooting problems, and suggesting solutions in real time, because they can interact with multiple systems and remember customer details. Agentic AI
delivers support that's not just quick, it's deeply personal and proactive. And the result is happier customers, reduced workload for agents, and improved efficiency for businesses, all
powered by intelligent evolving AI
systems. When most people hear AI agent,
they picture a simple chatbot. But real
AI agents go far beyond just replying to
questions. An AI agent is a system that
can understand a goal, plan how to
achieve it, act on that plan, and learn
from the results without constant manual
input. Let me break this down. First,
the agent understands the task. For
example, if you say, "Give me a daily
sales report," it knows it needs
numbers, trends, and summaries. Next, it
creates a plan like pulling data from
your CRM, cleaning it, and generating
insights. Then, it connects to external
tools such as APIs, databases, web
searches, or even other AI agents to
gather what it needs. After that, it
executes the plan step by step. And
finally, it learns from the feedback,
storing that experience in memory to
perform better next time. So, that's the
real power of AI agents. They don't just
react, they reason, act and improve over
time. Now, here's the challenge.
Building such a capable agent from
scratch is not easy. You would have to
design its architecture, handle
communication between components, manage
memory, integrate external tools, and
make sure everything runs smoothly.
That's a lot of time, effort, and
maintenance. Agentic frameworks solve
this problem by giving developers and
organizations a readymade structure to
build agents like how a game engine
saves developers from writing the entire
physics of a game from scratch. So think
of it like this. An AI agent is like a
skilled driver. An agentic framework is
like the road system providing
direction, structure, rules and smooth
connections between destinations. So
without a framework, your agent can
work, but it's slow, fragile, and hard
to scale. With a solid framework, it can
move fast, connect to multiple tools,
and collaborate with other agents
effectively. So I hope now it's clear to
you all why we need agentic frameworks.
Now, a good agentic framework isn't just
a tool. It's a complete environment that
handles the heavy lifting behind the
scenes. And here are some of the most
important features. First is the defined
architecture, a clear blueprint for how
the agent plans, decides and interacts.
Next comes the communication layer. It
enables smooth interaction with APIs,
databases, humans, and other agents.
Next is the task management. It handles
multi-step tasks, priorities, and
dependencies. Next comes the tool
integration. Pre-built connectors make
setup faster and easier. Next is memory
and learning. It allows agents to
remember past interactions and improve
over time. And finally is the monitoring
and control. It gives visibility into
what the agent is doing making it easier
to debug and optimize. So in short, I
can say that the framework gives the
foundation so you can focus on building
intelligence instead of worrying about
infrastructure. So why are these
frameworks gaining so much attention?
because they allow businesses to scale
AI beyond single, isolated tasks. With
agentic frameworks, organizations can
build multiple agents that collaborate
on complex workflows, automate entire
processes end to end, keep systems
stable and consistent as they grow, and
deploy solutions faster since the core
infrastructure is already in place. So,
this is what turns a simple chatbot
experiment into a real AI-driven
operation. Now let's look at some of the
most popular agentic frameworks today.
So first on the list we have lang chain.
It's ideal for connecting language
models with external tools and creating
multi-step reasoning pipelines. Lang
chain is an open-source framework that
helps developers build applications using large language models like GPT, Claude, or Llama. These apps can think, use tools, and remember context. It connects LLMs to
things like APIs, databases, documents,
computational tools, and memory system,
letting them follow multi-step logic.
Basically, LangChain turns an LLM into an intelligent agent that can perceive, plan, and act. With LangChain, you can build agents fast and your way: use ready-made templates and patterns like ReAct to create agents in minutes, swap
models, tools or databases easily with
over a thousand integrations. You can
also customize agents with simple middleware for approvals and conversation management for handling sensitive data. And with LangGraph's durable runtime, your agents get persistent checkpoints and human-in-the-loop support automatically.
So with that said, next we have
LangGraph. It is powerful for designing
complex workflows where multiple agents
interact. Keep your agents on track with
human in the loop checks and easy
moderation. You can guide, approve, and
control what your agents do. LangGraph
makes it simple to build customizable
workflows, whether it's a single agent,
multiple agents, or a hierarchical setup.
It also remembers conversations for
richer long-term interaction. Plus, with
real-time streaming, users can see what
the agent is thinking and doing, making
the experience smooth and interactive.
The next framework on our list is
Autogen. It is great for building multi-agent
systems that collaborate like a team.
So, Autogen is an open-source framework
by Microsoft that lets you create AI
agents that can collaborate not just
with humans, but also with other AIs.
These agents can chat, reason, plan,
write code, use tools, and even review
each other's work to finish complex tasks
automatically. Autogen works like a
team. The user proxy agent represents
the user. It gives instructions or goals
and the assistant agents plan, reason,
and execute tasks. And a critic agent can
review the output and suggest
improvements. So these agents
communicate through messages just like
people chatting in a group project
discussing steps until they reach the
right answer. Autogen makes it easy to build multi-agent systems for research automation, software development, data analysis, and content generation, and it saves time, reduces human effort, and improves quality through self-review and collaboration. So you can create an Autogen setup where one agent writes code, another tests it, and a third reviews it,
all automatically. Autogen turns AI
models into autonomous collaborators
that can think, talk and work together.
It's like having AI teammates, each with their own role, solving problems
faster and smarter. Next on our list, we
have Crew AI. It focuses on
orchestrating specialized agents to work
together on complex tasks. Crew AI makes it easy to build and manage collaborative AI agents that can handle complex tasks on their own and at scale. It's easy, trusted, and scalable, helping businesses adopt AI across teams
with centralized management and
monitoring. You get LLM and tool
configuration, role-based access and
serverless containers. Next, moving on to OpenDevin, built for developer agents. It enables coding AIs that can write, test, and deploy code independently. OpenDevin is an
open-source project aimed at creating an
autonomous software engineer, an AI
agent that can understand software tasks, write code, debug, test, and deploy solutions automatically. OpenDevin uses large language models combined with specialized tools and environments to perform end-to-end software development tasks. It can understand developer instructions, plan coding steps, write and modify code, run and test scripts, fix bugs, and deploy solutions.
Basically, it acts as an AI pair
programmer or even a full autonomous
coding agent. OpenDevin is part of the new wave of agentic AI frameworks, systems that can act, learn, and collaborate. It helps developers automate repetitive coding tasks, debug faster, build prototypes independently, and ultimately improve productivity. Now let us move on to the next framework. We
have semantic kernel. A lightweight and
flexible framework for integrating
external skills easily. Semantic kernel
is an open-source toolkit that makes it
easy to build AI agents and connect the
latest AI models to your C#, Python, or
Java projects. It works like smart
middleware, turning model requests into
function calls and sending results back
fast. You can plug in your existing code
as extensions, integrate AI services
easily, and share them across your team.
It's modular, flexible, and built to
rapid enterprise solutions. The next
framework on our list is Llama Index. It
is perfect when your agent needs to work
with structured data. Llama index is an
open-source data framework that helps
developers connect large language models
like GPT or Llama to external data
sources such as databases, PDFs,
documents, APIs, or websites. LLMs are great at reasoning and generating language, but they don't naturally have access to your private data like internal documents, business reports, or real-time information. LlamaIndex acts as a bridge between LLMs and your data.
It ingests data from any source like
text files, PDFs, SQL databases, APIs,
Notion, etc., indexes that data efficiently for retrieval, and feeds the most relevant information back to the LLM when you ask a question. This process is known as RAG, which is retrieval augmented generation. So each framework brings something unique: some emphasize tool integration, others collaboration or data handling. The right choice depends on your use case and goals.
So there is no single best framework
only the best one for your specific
needs. So, here's a simple way to choose
the right agentic framework. First,
start with your goal. Are you building a
chatbot, an autonomous flow, or a multi-agent ecosystem? Next, check integration
needs. Can it connect easily to your
tools and data sources? Then, consider
scalability. Think, will you need more
agents later? Next, look for
flexibility. Can you customize how your
agents reason and act? And then evaluate
community support. Good documentation
and an active community save time. Then
balance cost and performance. Choose
something that fits your resources and
growth plans. So your framework should
align with the problem you want to solve
and not the other way around. So there
is a common confusion for beginners. AI
agent builders are like ready-to-use kits. You can drag, drop, and launch an agent fast, which is perfect for simple tasks like a support bot, but they are
limited in flexibility. Agentic
frameworks, on the other hand, give
developers the freedom to design
powerful customized systems from the
ground up. So, it's just that the
builders are like instant cake mix, fast
but limited. Frameworks are like having
all the ingredients. It takes more
effort, but you control the flavor,
shape, and result.
So agentic frameworks are becoming the
backbone of modern AI automation. They
make it possible to build intelligent
autonomous agents that don't just
respond but plan, act and learn. As
agentic AI continues to grow, these
frameworks will play a key role in
shaping the future of automation. If
this video helped you understand the
concepts clearly, don't forget to like,
share, and subscribe for more deep dives
on AI agentic systems and genai tools.
>> [music]
>> Let's imagine it's 8:30 in the morning
and you're having your first cup of
coffee. While you're getting ready, your
personal AI assistant has already
rescheduled your meeting through your
calendar agent, replied to a few emails via your email agent, and even negotiated a delivery update with your vendor's AI system, all by itself. No
waiting, no follow-ups, just intelligent
agents talking to each other in a
digital language of their own. Sounds
great, right? But this is exactly where
technology is headed. A world powered by
AI agent protocols. Today we will break
down what AI agents are, how they
interact, why these protocols matter,
and what makes system like A2A, MCP, ACP
and others so important. So, first
things first, what exactly is an AI
agent? Think of it as a digital helper
that doesn't just follow commands, but
can understand goals, make decisions,
and take actions. Unlike a simple
chatbot, an AI agent can observe what's
happening around it, think or reason
about what to do next, and act either by
talking to another agent or by
performing a task. For example, a
customer support AI agent can understand
a complaint, talk to a refund processing
agent, check the company's payment
system, and resolve the issue without
needing human help. But here's the
catch. For all the smart AI agents to
actually work together, they need a
common language. Think about how
computers talk to each other today. When
you visit a website, your browser and
the server don't guess what the other
means. They use standard internet
protocols like HTTP or TCP IP. These
rules make sure every message from your
YouTube video to bank transaction is
sent, received and understood the right
way. Now imagine a world of AI agents.
One built by Open AI, another by
Enthropic, another by Google, all trying
to collaborate without a shared
communication protocol. It's like having
a French agent trying to talk to a
Japanese agent with no translator in
between. Total chaos. That's where AI
agent protocols come in. Just like
humans use spoken languages, AI agents
use protocols to share context, exchange
information, and coordinate actions
securely and consistently. These
protocols define what agents can say,
how they say it, and when they act on
it. They make sure that when one agent
says schedule a meeting, another agent
understands exactly what that means, and
not just the words but the intent behind
them. Now when we talk about
communication, AI agents interact in
three main ways. So let's make this
simple. So we are going to talk about
the types of interactions in agentic
systems. So first of the list we have
agent to agent which is A to A. This is
when two agents talk directly. For
example, a travel booking agent might
ask a hotel reservation agent to find
available rooms. The A2A protocol
defines how they share requests, responses, and confirmations. Next, we have agent to user, which is AG-UI. This is when the agent talks to you, the human, like when you chat with ChatGPT, ask Siri to set a reminder, or use Google Assistant. Here
the focus is on understanding intent and
making communication natural and human
friendly. Next we have agent to resource
which is A2R. This is when an agent
accesses external resources like
databases, APIs or files. For example,
your AI analyst agent might pull data
from a company database or a stock
market API to prepare a report. These
three types of interactions are the
backbone of how intelligence systems
think, talk, and act. Now, why do protocols matter? Imagine if every company built
their own agents in isolation. One using
lang chain, another using crew AI,
another built on open AI. So if they all
spoke different languages, none of them
could collaborate. It would be like
trying to run the internet without a
standard like HTTP. Complete chaos.
That's why standardized protocols are
essential. They give AI agents a shared
way to exchange data and context,
coordinate actions, maintain trust and
transparency and collaborate securely
across systems. In short, I can say that
protocols are the glue that make
intelligent systems truly work together.
All right, now let's explore the five
key protocols shaping how AI agents
communicate with real examples so it's
easy to visualize. So again first on the
list we have A2A which is agentto agent
protocol. A2A defines how two or more
agents talk to each other, how they
share tasks, exchange results, and build
trust. It sets the structure for message
exchange, trust verification and intent
sharing. It's like a meeting protocol
between agents. They introduce
themselves, share goals, negotiate and
agree on the next step all without human
involvement. So, it is used in multi-agent environments. In logistics, a
delivery scheduling agent might talk to
a warehouse inventory agent to confirm
if products are ready before dispatching
a truck. Now, think of Microsoft Copilot
in Excel working together with Copilot
in Outlook. One analyzes your data while
the other summarizes insights in an
email draft. That's A2A communication in
action. Seamless collaboration between
agents behind the scenes. Next on the
list we have MCP which is model context
protocol. This protocol allows an AI
model like Claude or GPT to securely
connect to external tools like APIs and
databases without manual coding. In
simple terms, MCP helps AI models access
real world data safely and on demand.
and it is used whenever an AI needs
up-to-date or private data like
financial reports, product inventories
or real-time analytics. So, let's say
you're a sales manager. Your AI
assistant uses MCP to fetch live sales
numbers from your company's CRM, checks
targets, and generates insights
instantly. So, let's have a look at the
real world example. Anthropic cloud
models already use MCP to connect to
APIs, databases, or even Google Sheets,
allowing them to pull live information
rather than relying only on past
training data. It's what turns an AI
model into a contextaware decision
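As a rough sketch of the general idea MCP standardizes (this is not the official MCP SDK or wire format, just the shape of a tool-call flow):

```python
# Rough sketch of a tool-call flow: the model asks for a tool by name, the host
# runs it against the real data source, and hands the result back.
# NOT the official MCP SDK or wire format; the tool and fields are hypothetical.

def get_live_sales(region: str) -> dict:
    # Stand-in for a CRM/API lookup the host would actually perform.
    return {"region": region, "revenue": 128_450, "target": 120_000}

TOOLS = {"get_live_sales": get_live_sales}

# 1. The model emits a structured tool request instead of guessing from stale training data.
tool_request = {"tool": "get_live_sales", "arguments": {"region": "EMEA"}}

# 2. The host looks up and executes the requested tool, then returns the fresh result.
result = TOOLS[tool_request["tool"]](**tool_request["arguments"])
print(result)   # the model would now use this live data to write its answer
```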
Now comes ACP, or the agent
communication protocol, the backbone of
reasoning and goal sharing among agents.
ACP defines how agents express their
beliefs, intentions, and plans,
essentially their thought process. It
lets one agent tell another why it's
doing something, what it believes is
true, and what it plans to do next. It
is used in large collaborative systems
where multiple agents must plan
together. In a smart city, different AI
agents manage traffic, energy, and
emergency systems. The traffic agent
might signal that it plans to reroute
cars due to congestion, and the energy
agent adjusts street-light timings to
save power in that area. So now let's
have a look at the real-world example.
Research labs at MIT and Stanford are
experimenting with ACP-style
communication to build multi-agent
reasoning systems where agents plan
together instead of acting individually.
This is what enables collective
intelligence: agents thinking as a team.
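To picture that smart-city exchange, here is a hypothetical sketch of ACP-style messages in Python; the field names and the agents' behavior are invented for illustration, not taken from any real ACP implementation.

```python
# Hypothetical ACP-style exchange: one agent states a belief, an intention, and a plan,
# and a second agent adjusts its own plan in response. Fields are illustrative only.
traffic_msg = {
    "from": "traffic-agent",
    "belief": "Congestion on 5th Avenue exceeds 80% capacity",
    "intention": "reroute_vehicles",
    "plan": ["divert traffic to 7th Avenue", "extend green-light cycle on 7th"],
}

def energy_agent_react(msg):
    # The energy agent reads the other agent's stated intention, not just raw data,
    # and updates its own plan accordingly.
    if msg["intention"] == "reroute_vehicles":
        return {"from": "energy-agent",
                "plan": ["brighten street lights on 7th Avenue",
                         "dim street lights on 5th Avenue to save power"]}
    return {"from": "energy-agent", "plan": []}

print(energy_agent_react(traffic_msg))
```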
Now, what if these agents aren't all
inside one company, but spread across
the internet? That's where ANP, or the
Agent Network Protocol, comes in. It's
designed for peer-to-peer communication
between agents across different networks
or organizations, kind of like a
decentralized messaging layer for AI. It
is used in large distributed systems
like global logistics, financial trading,
or smart city networks where different
companies' agents must cooperate. So
imagine several hospitals using AI agents
to detect disease patterns. With ANP,
each hospital's agent can share
anonymized insights with others, improving
predictions without revealing patient
data. As for the real-world example,
projects like Fetch.ai are already using
ANP-like concepts, where agents trade
data and digital services securely in
decentralized marketplaces. ANP is like
the internet of agents, enabling
large-scale collaboration. Finally, let's talk
about AGUI, or the agent-user interface
protocol, the bridge between humans and
agents. This protocol defines how agents
understand and respond to human input,
whether it's through text, speech, or
visuals. It focuses on intent
recognition, clarity, and explainability,
ensuring humans always stay in control.
It is used in every interface you
interact with, like voice assistants,
customer chatbots, or AI copilots in
software tools. So a financial advisor
AI might use AGUI principles to explain
why it suggests a certain investment,
showing risks and returns in simple
visuals. So now let us have a look at
the real-world example. ChatGPT,
Gemini, and Meta's AI Studio all follow
AGUI-like standards, focusing on tone,
clarity, and transparency to make
interaction humanlike. At companies like
JPMorgan, AGUI-style systems ensure
AI tools explain financial advice
clearly so users remain in control. I
hope this is clear now. Now let's see
how all of these work together in one
real-world scenario. Imagine a global
e-commerce company. MCP lets its AI
agents pull live inventory and customer
data. A2A allows marketing and logistics
agents to coordinate automatically. ACP
ensures they plan campaigns and deliveries
in sync. ANP connects these agents
securely with partner companies
worldwide. And finally, AGUI delivers
the final insights to human managers
through a simple dashboard. All these
protocols together form a digital
nervous system where every agent and
every human stays connected and
informed. We are entering a time when AI
agents won't just live inside one app or
company. They will work across
platforms, industries, and even nations.
Imagine healthcare agents from different
hospitals sharing data securely to speed
up diagnosis, or financial agents running
cross-border compliance checks in seconds.
That's not science fiction. It's already
happening, and it's only possible because
of protocols like A2A, MCP, ACP, ANP, and
AGUI. They are not just technical
frameworks. They are the language of
intelligent collaboration. So next time
you hear about agentic AI, remember it's
not just about powerful models or
automation. It's about creating a world
where intelligent agents can talk,
think, and work together, a world
connected through these silent yet
powerful AI agent protocols. They are
the bridge between isolated AI systems
and truly connected intelligence.
Amazon just dropped a major AI upgrade,
Alexa plus, and it's unlike anything we
have seen before. It's not just an
update, it's a complete transformation
powered by generative AI. But what
exactly makes Alexa smarter, more
conversational, and more capable? Well,
in this video, we will break down how
Amazon has leveraged state-of-the-art AI
models to make Alexa a true AI
assistant, how it compares to
competitors like ChatGPT Voice and
Google Assistant, and whether it's the
future of voice AI. Let's rewind a bit.
Alexa started as a simple voice
assistant in 2014. It could set
reminders, play music, and control smart
devices. But it had one major
limitation: it wasn't really thinking,
just following predefined rules. As AI
advanced, assistants like Apple's Siri and
Google Assistant improved. But Amazon
saw an opportunity to turn Alexa into a
true conversational AI. And that's where
generative AI comes in. Enter Alexa
Plus, a brand new AI powered version of
Alexa that understands context,
remembers conversations, and sounds more
natural than ever. Launched on February
26, 2025, Alexa Plus is Amazon's
next-generation AI assistant, designed to
provide more natural conversational
interactions and enhanced capabilities.
This upgrade enables Alexa to perform
complex tasks such as planning events,
managing schedules, and controlling
smart home devices more efficiently.
Alexa Plus represents a significant
evolution from the original Alexa,
introducing several key enhancements. So
let us see what are they. First we have
conversational abilities. Alexa Plus
offers more natural and expansive
interactions, understanding colloquial
expressions and complex ideas, making
conversations feel smoother and more
intuitive. Building on that, it also
takes a more proactive approach to
assisting users. Unlike the original
Alexa, which primarily responded to
direct commands, Alexa Plus can
anticipate user needs such as suggesting
earlier dispatches due to traffic or
notifying about sales on desired items.
In addition, it has become more
personalized than ever. Alexa Plus can
remember user preferences, dietary
restrictions, and important dates,
tailoring responses and actions to
individual needs. Whereas the original
Alexa had limited personalization
capabilities. Beyond personalization, it
also enhances task management. The new
Alexa can handle complex tasks like
making reservations, ordering groceries,
and coordinating multiple services
seamlessly, surpassing the more basic
functionalities of the original Alexa.
Not just that, it also integrates with
more services than before. Alexa Plus
connects with a broader range of
services and devices, including Grubhub,
OpenTable, Ticketmaster, and various
smart home products making it even more
versatile. On top of all these
improvements, it now has the ability to
act independently. Agentic capability
is a notable advancement in Alexa Plus.
Now that we have seen how Alexa Plus has
improved, let's dive into the
technology behind it and understand how
generative AI models and agentic AI
capabilities power this next-generation
assistant. Alexa Plus is built on
cutting-edge generative AI and agentic AI,
leveraging powerful models and
algorithms to process language,
understand context, and execute tasks
autonomously. So let's break down the
key technologies that make this
possible. First, large language models,
or LLMs: the brain behind conversations.
At the core of Alexa Plus is an advanced
transformer-based language model, similar
to GPT-4, Claude, and Amazon's proprietary
Titan model. These LLMs are trained on
vast data sets, allowing Alexa to
understand complex queries and respond
naturally, maintain context across
conversations so interactions feel
more fluid, and generate humanlike
responses, reducing robotic and
repetitive phrasing. And by using
techniques like reinforcement learning
with human feedback, Alexa Plus
continuously improves its conversational
ability based on real-world
interactions. The next technology is
agentic AI, enabling proactive and
autonomous actions. Beyond just
responding to commands, Alexa Plus
integrates agentic AI models, which allow
it to act independently. Built on RAG
and action models, it can plan
multi-step tasks, for example finding a
restaurant, booking a table, and
arranging transportation. It retrieves
real-time web data to provide the latest
information and executes actions across
multiple apps and services without user
micromanagement.
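As a toy illustration of what that kind of multi-step task execution looks like (the "tools" below are fake stand-ins, not Alexa's real services):

```python
# Toy sketch of multi-step task execution: search, book, then arrange transport.
# The tool functions are fake stand-ins; a real agent would call live services.
def find_restaurant(cuisine):  return {"name": "La Piazza", "cuisine": cuisine}
def book_table(restaurant, time):  return {"restaurant": restaurant["name"], "time": time}
def arrange_ride(destination, time):  return {"pickup": time, "dropoff": destination}

def plan_evening(cuisine, time):
    steps = []
    restaurant = find_restaurant(cuisine)            # step 1: search
    steps.append(f"Found {restaurant['name']}")
    booking = book_table(restaurant, time)           # step 2: act on the result
    steps.append(f"Booked table at {booking['time']}")
    ride = arrange_ride(restaurant["name"], time)    # step 3: chain a follow-up action
    steps.append(f"Ride arranged to {ride['dropoff']}")
    return steps

for step in plan_evening("Italian", "19:30"):
    print(step)
```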
This enables a fully autonomous AI
assistant experience, reducing the need
for manual user input. After agentic AI,
the technology that makes Alexa so
versatile is neural network
architectures enabling speech and
context awareness. Alexa Plus utilizes
deep learning techniques such as
sequence-to-sequence models for natural
language generation; BERT, which stands
for Bidirectional Encoder Representations
from Transformers, for understanding user
intent with greater accuracy; and
Whisper ASR (automatic speech recognition)
for improved voice processing, making
Alexa more responsive to different
accents and speech patterns. These
advancements enable highly accurate
speech recognition, contextual
understanding, and real-time adaptation to
user behavior. Alexa Plus integrates
long-term memory storage using vector
databases like FAISS or Amazon Aurora,
allowing it to remember user preferences
over time, adapt to individual habits
and routines for a more personalized
experience, and provide contextual
reminders based on past interactions.
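As a rough sketch of how that kind of long-term memory can sit on a vector index like FAISS (the random vectors below stand in for real text embeddings, and the stored "memories" are made up):

```python
# Toy sketch of long-term "memory" backed by a FAISS vector index.
# Random vectors stand in for real text embeddings from an embedding model.
import numpy as np
import faiss

dim = 64                                   # embedding dimension (toy value)
memories = [
    "User is vegetarian",
    "User's anniversary is June 12",
    "User prefers morning meetings",
]
rng = np.random.default_rng(0)
vectors = rng.standard_normal((len(memories), dim)).astype("float32")

index = faiss.IndexFlatL2(dim)             # exact L2 search over stored vectors
index.add(vectors)                         # store the "memories"

query = rng.standard_normal((1, dim)).astype("float32")  # embedding of a new user request
distances, ids = index.search(query, 1)    # recall the nearest stored memory
print("Recalled:", memories[ids[0][0]])
```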
This deep personalization is what makes
Alexa Plus feel more like a true digital
assistant rather than just a voice
control device. Then comes the
technology that makes Alexa Plus capable
of understanding and interacting with
users in multiple ways, which is
multimodal AI. Alexa Plus leverages
multimodal AI, combining natural language
processing for text-based queries,
computer vision for Echo Show devices,
enabling it to process and analyze
on-screen content, and speech synthesis
to generate humanlike voice responses.
This makes Alexa Plus capable of
understanding and interacting with users
in multiple ways, enhancing its overall
functionality. And by combining LLMs,
agentic AI, deep learning models, and
real-time data retrieval, Alexa Plus
represents a significant leap in AI-driven
virtual assistance. It is no longer just
a voice assistant. It is an autonomous,
context-aware, and highly personalized AI companion
designed to make daily life easier. Now
that we have explored the technology
behind Alexa Plus, let us see how it
stacks up against other leading AI
assistants. Alexa Plus enters the AI
assistant space with generative AI and
agentic AI, making it smarter and more
proactive. But how does it compare to
the top AI models available today?
Let's break it down across key aspects.
We will compare them based on five key
factors: AI power and capabilities,
personalization and memory, proactive
and autonomous tasks, ecosystem and
third-party integration, and finally
conversational abilities. So first, let
us compare them on AI power and
capabilities. How powerful is the AI
behind each assistant? Alexa Plus uses
Amazon Titan plus custom LLMs with
generative AI and agentic AI for smart,
proactive responses. ChatGPT Voice runs
on GPT-4, great for deep conversation but
lacking real-world task execution.
Google Assistant uses Gemini AI, best
for search and multimodal inputs such as
text, voice, and images. And Apple's Siri
uses the Ajax LLM, improving in language
but still rule-based and limited. So
Alexa Plus leads in proactive AI, while
GPT-4 dominates in conversation. Next, let us
compare them in terms of personalization
and memory. Can the assistant remember
your preferences and adapt? Let us see.
Alexa Plus has long-term memory of
routines, preferences, and contextual
adaptation. ChatGPT Voice has limited
memory that resets after sessions.
Google Assistant remembers preferences
inside Google apps but lacks deep
personalization. Apple's Siri has minimal
memory and mostly relies on Apple's preset
commands. So, Alexa Plus leads in
remembering and adapting to users. Next
is proactive and autonomous task
execution. Can it handle tasks on its
own? Let's see. Alexa Plus uses agentic
AI for multi-step automation, for
example booking, ordering, and reminders.
ChatGPT Voice assists with planning but
can't perform real-world automation.
Google Assistant can set reminders and
retrieve information but lacks deep
automation. Apple's Siri is limited to
commands and relies on shortcuts for
basic automation. So here, Alexa Plus is
the most proactive, handling tasks
automatically. Next is the ecosystem and
third-party integration. How well does it
work with other devices and apps? Well,
Alexa Plus is best for the smart home,
with Amazon Echo, Ring, and third-party
integrations. ChatGPT Voice can connect
to some external tools but has no smart
home control. Google Assistant has deep
integration with Google apps and
services, whereas Apple's Siri is limited
to Apple devices with minimal third-party
support. So here again, Alexa Plus and
Google Assistant lead, but Alexa has
better smart home control. And finally,
conversational abilities. How natural
and humanlike are the conversations?
Alexa Plus is natural, expressive, and
context-aware. ChatGPT Voice is best for
deep, intelligent conversations. Google
Assistant is accurate but more
search-focused. Apple's Siri is still
command-based with limited depth. So
ChatGPT Voice is best for deep
conversations, but Alexa Plus is most
natural for voice interactions. So now
let us see the future of AI assistants.
Let's see what's next. So here we have
smarter AI memory: assistants will
remember and personalize even better.
Next, more autonomy: AI will handle
complex multi-step tasks independently.
Then, more humanlike conversations: AI
will feel more natural and intuitive.
Next, seamless integration: AI will
connect effortlessly across devices and
services. Next, real-time decision
making: AI will anticipate needs and
offer proactive help. So Alexa Plus is
best for automation, memory, and smart
home control. ChatGPT Voice is best for
deep, intelligent conversation. Google
Assistant is best for search and Google
productivity. And Apple's Siri is best
for Apple users but still limited in AI
features. Alexa Plus is not just an
upgrade; it's a redefinition of AI
assistants, with generative AI and
agentic AI for smarter, proactive help.
So what do you think? Which AI assistant
is your favorite? Let me know in the
comments below.
[music]
Did Alibaba just do the impossible?
Their latest AI model has outperformed
both GPT-4 and DeepSeek in some key
benchmarks. But how did they manage to
do it? And what does this mean for the
AI race? Stick around as we dive into
the shocking details behind this
breakthrough and what it means for the
future of AI. Alibaba just dropped a
bombshell during the Lunar New Year: a
new AI model called Qwen 2.5 Max, and
they say it outperforms OpenAI's GPT-4o,
Meta's Llama, and even China's own rising
star, DeepSeek. Is this the new benchmark
in AI? Let's break it down. First off,
what exactly is Qwen 2.5 Max? Developed
by Alibaba Cloud, this model is being
hyped as a major rival to GPT-4o.
According to their benchmarks, it crushes
competitors in reasoning, coding, and
multilingual tasks. So let's look at the
numbers. In Arena-Hard, a benchmark for
complex problem solving, Qwen 2.5 Max
scored 85.3%, beating GPT-4o's 80.2% and
DeepSeek V3's 77.5%. But here's the twist.
It's not just about the raw power.
Alibaba built this model for businesses.
Think customer service bots that speak
10 languages or AI coders that debug
Python faster than your engineering team.
And unlike OpenAI's premium pricing,
Alibaba is offering Qwen 2.5 Max at a
fraction of the cost. But why drop this
during the Lunar New Year, when half of
China is on vacation? Well, that's where
the discussion begins. Meet DeepSeek, the
20-month-old startup that's been shaking
up Silicon Valley. Three weeks ago, they
dropped DeepSeek V3 and the R1 model.
And the secret is insanely low cost. We
are talking $0.14 per million tokens.
That's like charging pennies for a
Lamborghini. DeepSeek's cheap open-source
models triggered an AI price war in
China. Alibaba reduced prices by up to
97% overnight, but DeepSeek's founder
Liang Wenfeng isn't sweating it. In a
rare interview, he said, "We don't care
about the price. AGI is our goal." AGI,
that's artificial general intelligence,
AI that can outthink humans. And here's
the twist. DeepSeek isn't some corporate
giant. They are a tiny team of grad
students and researchers working out of
Alibaba's hometown, Hangzhou. Meanwhile,
Alibaba's got 200,000 employees. So, how does Qwen
2.5 Max actually stack up? Let's
compare. First, let's talk about
reasoning. Qwen takes the lead here.
Next, when it comes to coding, Qwen
continues to shine. And if multilingual
support is what you are after, Qwen
speaks Mandarin, English, Spanish, you
name it. But DeepSeek V3 still holds the
crown for affordability. And GPT-4? It's
holding on to its reputation. But even
OpenAI's Sam Altman admitted DeepSeek's
progress is impressive. And Alibaba's
timing is strategic. Releasing Qwen 2.5
Max during the Lunar New Year, when
everyone's distracted, is a power move.
It's like dropping a diss track on
Christmas Day. No one's looking, but
everyone will hear it. So, here's why
this matters. DeepSeek's R1 model wiped
$1 trillion off US tech stocks in a day.
Nvidia, Meta, Microsoft all dropped.
Why? Because if a tiny Chinese startup
can match GPT-4 at a hundredth of the
cost, investors wonder, are we
overspending on AI? And China's giants
aren't sitting still. ByteDance updated
its AI model days after DeepSeek's
launch. Tencent and Baidu are in a
price-cutting frenzy. Meanwhile,
Alibaba's betting big on Qwen to dominate
enterprise AI. Think hospitals, banks,
and mega corporations. And the real
question is: who's closer to AGI,
DeepSeek's agile team or Alibaba's
corporate powerhouse? Share your thoughts
in the comments. So, is Qwen 2.5 Max the
new AI champion? Maybe. But this isn't
just about benchmarks. It's a glimpse
into the future. A future where AI isn't
just built by Silicon Valley giants, but
by startups in Hangzhou and
open-source communities worldwide.
A game-changing development has taken
the tech world by surprise.
>> I think we should take the development
out of China very very seriously. A free
open-source AI model emerged seamlessly
out of nowhere. It not only matched but
surpassed some of the most advanced
systems on the market. What made this
even more remarkable was its origin. It
wasn't a new release from OpenAI nor a
breakthrough from Anthropic. It was
DeepSeek, an AI model developed in China,
and its development left top AI
researchers in the United States in
amazement especially when they learned
about the staggering cost behind it.
>> It's opened a lot of eyes of like what
is actually happening in AI in China.
The training cost for Deepseek version 3
was just $5.576 million. And in
comparison, OpenAI spends a massive $5
billion annually. While Google's capital
expenditures are projected to exceed $50
billion by 2024.
Microsoft on the other hand invested
over $13 billion just in OpenAI. And yet
deep model outperformed these highly
funded AI models from leading American
companies. And the contrast is truly
mind-blowing
>> to see the deepseek um um new model.
It's it's super impressive in terms of
both how they have really effectively
done an open source model that does what
uh is this inference time compute and
[music] it's super compute efficient.
>> DeepSeek didn't stop at the success of its
powerful open-source AI model. Instead,
it quickly introduced R1, a
next-generation reasoning model that has
already surpassed the advanced OpenAI o1
model in several third-party benchmarks.
This rapid innovation highlights
DeepSeek's ability to surpass even the most
well-funded United States AI giants,
proving that agility and creativity can
disrupt the established leaders in the
race for AI dominance. As we dive
deeper into this, let's hear from Martin
Vechev, the director of the Bulgarian
Institute for Computer Science,
Artificial Intelligence, and Technology.
He recently made some interesting
statements about the AI industry, and
they are shaking things up. So, he
pointed out that a Chinese AI startup
claimed to have developed its R1 LLM
model with less than $6 million while
other companies are pouring in billions.
That announcement alone caused Nvidia
stock to drop. And according to Martin,
these models are built by strong
researchers and engineers in the field,
many of whom actively publish their
work. But developing these AI models can
be incredibly expensive. Just to give
you an idea, running 2048 H800 GPUs
could cost anywhere between $50 and $100
million. And he also mentioned that the
company handling the data center is
backed by a massive Chinese investment
fund with far more GPUs than just those
2048 H800 units. As for the architecture
behind the DeepSeek R1 and V3 models,
Martin explained that they use a
mixture-of-experts (MoE) approach. Simply
put, this means that at any given time
only a small percentage of the model is
active, making it much more efficient in
real-time use. This raises a lot of
questions about cost efficiency, AI
development strategies, and how companies
are competing in this space. Now the
question is: if DeepSeek's development is
being reported to have cost only $5 to $6
million, how does this figure align with
the extensive infrastructure, data center
operations, and substantial backing from
Chinese investment funds? Could there be
more to the story that isn't being
disclosed? Let us know in the comment
section below. As far as the research
indicates, DeepSeek V3 has been utilized
as the base model for DeepSeek R1, and
this progression highlights DeepSeek's
strategic approach of building on its
existing architecture while pushing the
boundaries of AI capabilities. DeepSeek
R1 distinguishes itself by relying
heavily on reinforcement learning
fine-tuning, a focused and efficient
method that contrasts sharply with
OpenAI's GPT infrastructure. OpenAI's
GPT (generative pre-trained transformer)
framework employs a combination of
supervised learning, unsupervised
learning, and reinforcement learning to
train its models. While this
multifaceted approach has proven
effective, it also requires significant
computational resources and time. In
contrast, DeepSeek R1's emphasis on
reinforcement learning fine-tuning
demonstrates a more streamlined and
targeted methodology, which not only
reduces cost but also enhances
performance in specific tasks. This
difference in training strategies
highlights DeepSeek's remarkable ability
to innovate efficiently: by building on
its foundational V3 model, it developed
R1, a reasoning model that has already
surpassed OpenAI's advanced systems in
some key benchmarks. DeepSeek's focus on
reinforcement learning has allowed it to
carve out a unique position, directly
challenging the dominance of United
States AI giants. This approach
demonstrates that with strategic,
resource-conscious innovation,
groundbreaking results are not only
possible but are already happening.
So what do you think? Does DeepSeek's
open-source approach give it a long-term
advantage? Or will OpenAI's heavy
investment in research and proprietary
models keep it ahead? Share your
thoughts in the comments.
[music]
Did a Chinese AI model just shake up the
entire US market? Let me tell you what
happened. On January 27, 2025, the stock
market took a serious hit. Tech stocks
dropped hard, and the biggest shock was
that Nvidia, a prominent player in the AI
hardware sector, saw its stock crash by
17%. That's a $590 billion loss in
market value. And it was because of an
AI model called DeepSeek R1. Yes, you
heard that right. A Chinese AI model
just sent shock waves through the
industry, raising big concerns about
China's growing AI dominance and what it
means for companies like OpenAI,
Google, and even Nvidia. So why is
DeepSeek R1 such a big deal? Well, it's
not just another AI model. It's a
game-changer. With models like DeepSeek
R1, DeepSeek V2, and DeepSeek Coder, it's
going head-to-head with top players
like OpenAI and Google, offering
powerful AI at a fraction of the cost.
Here I have added a screenshot for your
reference, and this table compares
large language models based on their
accuracy and calibration error. Among
the models mentioned, DeepSeek R1 has
the highest accuracy at 9.4%,
outperforming o1 at 9.1%, Gemini
Thinking at 6.2%, and other models such
as GPT-4o at 3.3% and Grok-2 at 3.8%.
Furthermore, DeepSeek R1 has the lowest
calibration error at 81.8%, indicating
improved confidence calibration over
other models with errors greater than
88%. This demonstrates that DeepSeek R1
not only produces the most accurate
results but also has higher forecast
reliability. This benchmark graph shows
DeepSeek R1's exceptional performance
across a variety of evaluation tasks,
solidifying its position as a top-tier
LLM. Notably, DeepSeek R1 achieves the
highest scores in AIME 2024 with 79.8%,
Codeforces with 96.3%, MATH-500 with
97.3%, and MMLU with 90.8%, indicating
superior reasoning, problem-solving, and
coding skills. When compared to OpenAI's
o1 models and DeepSeek V3, DeepSeek R1
consistently outperforms or equals top
models, particularly in domains requiring
precise logical reasoning and
mathematical skills. Furthermore, its
SWE-bench Verified score of 49.2%
demonstrates its suitability for
software engineering applications, and
this finding supports DeepSeek R1's
advancements in AI research, establishing
it as a formidable competitor in the LLM
space. First, let's talk about DeepSeek
R1-Zero and its successor, DeepSeek R1. So,
let's break it down. In reinforcement
learning, there are two main components,
the agent and the environment. The agent
interacts with the environment and based
on its actions, it receives rewards or
penalties. The goal of the agent is to
maximize these rewards by learning from
its mistakes and improving over time.
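To make that loop concrete, here is a tiny toy sketch of an agent learning from rewards; it's a simple epsilon-greedy bandit for illustration only, nothing like DeepSeek's actual training setup.

```python
# Toy agent-environment loop: an epsilon-greedy bandit.
# Illustrates "act -> get reward -> update", not DeepSeek's training pipeline.
import random

true_reward_probs = [0.2, 0.5, 0.8]   # environment: 3 actions with hidden reward rates
estimates = [0.0, 0.0, 0.0]           # agent's current value estimates
counts = [0, 0, 0]
epsilon = 0.1                          # exploration rate

for step in range(1000):
    # Agent chooses an action: explore sometimes, otherwise exploit its best estimate.
    if random.random() < epsilon:
        action = random.randrange(3)
    else:
        action = max(range(3), key=lambda a: estimates[a])

    # Environment returns a reward (1 or 0).
    reward = 1 if random.random() < true_reward_probs[action] else 0

    # Agent updates its estimate from the feedback (incremental average).
    counts[action] += 1
    estimates[action] += (reward - estimates[action]) / counts[action]

print("Learned value estimates:", [round(e, 2) for e in estimates])
```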
Now, let's talk about DeepSeek R1-Zero.
This model was a pioneering attempt to
use reinforcement learning without
supervised fine-tuning. And the idea was
to let the model learn entirely through
interaction with its environment without
any pre-labelled data. However, this
approach had some challenges. DeepSeek
R1-Zero faced two major issues. First is
poor readability: the model's outputs
were often hard to understand. And next,
language mixing: the model sometimes
mixed languages, especially Chinese,
which affected its performance on
English tasks. To address these issues,
the team introduced DeepSeek R1. This
new model not only solved the problems
of readability and language mixing but
also achieved remarkable performance. In
fact, DeepSeek R1 matched the accuracy
of OpenAI's o1 model, specifically the
OpenAI o1-1217 model, on reasoning tasks.
That's a huge milestone. But that's not
all. DeepSeek R1 is also 24 to 28 times
cheaper to train compared to other
state-of-the-art models. And this makes
it not only highly effective but also
cost-efficient, opening up new
possibilities for research and
applications. So, to recap: DeepSeek
R1-Zero was an ambitious attempt at
reinforcement learning without supervised
fine-tuning, but it faced challenges with
readability and language mixing. DeepSeek
R1 addressed these issues, achieving
top-tier accuracy and being significantly
more cost-effective. Now, as far as GPT
is concerned, ChatGPT combines
unsupervised learning, supervised
fine-tuning, and RLHF, making it more
aligned for text-based reasoning and
safe AI interactions.
Now we will look at the model
comparison. DeepSeek and GPT are both
pushing the boundaries of what AI can
do, but they take very different
approaches. So let's break it down. We
asked OpenAI's o1 model and the DeepSeek
R1 model to generate a Python code where
a ball bounces inside a rotating triangle.
Sounds cool, right? Well, let's check
out the result.
First up, here's what o1 came up
with. It works, but the physics seems a
bit off and the movement isn't as smooth
as you would expect. Not bad, but it's
not quite there yet. Now, let's look at
what DeepSeek R1 generated.
Wow, this one looks way better, right?
The ball's movement feels more natural
and the rotation of the triangle is much
smoother. The overall gameplay
experience is just more polished. So, if
we compare the two, DeepSeek R1 definitely
outperformed o1 in this challenge. Of
course, both models are impressive in
their own ways, but when it comes to
designing this specific game, DeepSeek R1
takes the win. Now, we will go through
the differences in detail. So first
let's talk about the architecture.
DeepSeek uses a mixture-of-experts
design. Think of it like a team of
specialists: only the relevant experts
are activated for each task. So, for
example, DeepSeek V3 has 671 billion
parameters, but only 37 billion are
activated per token, making it super
efficient. On the other hand, GPT models
use a dense transformer architecture
where all the parameters are active at
once; GPT-3, for instance, has 175
billion parameters all working
simultaneously, and this makes GPT models
powerful but also computationally
expensive.
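To picture the mixture-of-experts idea, here is a toy sketch of top-k expert routing; the sizes and routing logic are purely illustrative, not DeepSeek's actual implementation.

```python
# Toy top-k mixture-of-experts routing: only a few "experts" run per token.
# Purely illustrative; real MoE layers are far larger and more sophisticated.
import numpy as np

rng = np.random.default_rng(0)
d, num_experts, top_k = 16, 8, 2

experts = [rng.standard_normal((d, d)) for _ in range(num_experts)]  # tiny expert "layers"
router_w = rng.standard_normal((d, num_experts))                     # gating network weights

def moe_forward(token_vec):
    logits = token_vec @ router_w                    # router scores each expert
    top = np.argsort(logits)[-top_k:]                # keep only the top-k experts
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over chosen experts
    # Only the selected experts do any work; the rest stay idle.
    return sum(w * (token_vec @ experts[i]) for w, i in zip(weights, top))

out = moe_forward(rng.standard_normal(d))
print(out.shape)   # (16,) -- same shape as the input, computed by just 2 of 8 experts
```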
Now let's talk about cost and
efficiency. DeepSeek is a game-changer
here. It was developed on a budget of just
$5.5 million due to its efficient design
and that's a fraction of what other
models cost. GPT models like GPT-4
require massive computational
resources. Training GPT-4 reportedly cost
over a hundred million dollars, making it a
heavyweight in terms of both performance
and expense. And when it comes to
performance, both models shine in
different areas. DeepSeek is a powerhouse
in tasks like coding, translation, and
solving complex math problems. In
fact, DeepSeek R1 has been shown to match
the performance of advanced systems from
OpenAI and Google despite its smaller
budget. GPT models like GPT-4 are known
for their natural language
understanding, creative writing, and
complex reasoning; they are incredibly
versatile and can handle a wide range of
tasks with ease. Next, accessibility is
another key difference. DeepSeek is
open-source, meaning its code
is available to the public and this
promotes transparency, collaboration and
innovation within the AI community. GPT
models, on the other hand, are primarily
proprietary. While OpenAI has released
some tools and models, many advanced
versions are restricted and accessed
through APIs. Finally, let's talk about
the ethics and censorship. DeepSeek
implements strict content moderation,
especially for politically sensitive
topics. This ensures compliance with
regulatory standards, but can sometimes
limit its responses. GPT models also
have moderation mechanisms to prevent
harmful outputs. But they strive to
balance open access to information with
ethical guidelines. Now we will install
DeepSeek R1 and run a short demo on
it. So let's see how the DeepSeek R1 model
can be installed. First, let's open up
our browser and head over to
ollama.com. And once you're there, you will
see a download button. So go ahead and
click on that. Now select download for
Windows to start the download. And keep
in mind it's a pretty big file. So it
might take a minute to download. So
let's give it some time.
Once the download finishes, go to the
download folder and find the
installation file. Now here, double
click on the file to open the
installation window. And you will see an
install button. So simply click on that
and Ollama will start installing on your
system.
So once Ollama is installed, let's head
back to the Ollama website. Now click on
the models tab at the top of the page,
and here you will see a list of
available models. For this video we
are going to use the DeepSeek R1 model.
So we will select the 1.5B model but if
you want you can also choose the latest
7B model too. Now once you have selected
the model you will find the installation
command here. So go ahead and copy that
command. Now open your terminal on your
Windows system and paste the command we
copied earlier. So this is the command
and this command will start pulling the
model. So depending on your internet
speed, this might take a little time. So
be patient
and that's it. Once the process is done,
your DeepSeek R1 model will be all set up
and ready to use on your system.
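By the way, if you'd rather query the locally installed model from Python instead of typing in the terminal, here is a minimal sketch; it assumes Ollama is running on its default local port (11434) and that you pulled the 1.5B tag shown above.

```python
# Query the local DeepSeek R1 model through Ollama's local REST API.
# Assumes Ollama is running on its default port and the model tag below was pulled.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "deepseek-r1:1.5b",   # the model tag pulled in the demo
        "prompt": "Write a Python one-liner to list all files in a directory.",
        "stream": False,               # return one JSON object instead of a stream
    },
    timeout=300,
)
print(resp.json()["response"])
```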
All right, now let's try out some
commands in the terminal. So let's say hello.
Okay, we got some response here. Now
let's ask it to tell us something about
itself.
All right, so it responded saying,
"I am DeepSeek R1, an AI assistant." Next,
let's ask it to design a Python code to
list all the files in a directory.
So as you can see, it has provided
the code we asked for. Great, right? So,
what does all of this mean for the
future of AI? Well, AI is becoming more
accessible. Like, for years, AI
development was mostly controlled by big
companies with massive budgets. But now,
with open-source models like DeepSeek
R1, anyone whether you have a small
startup or you are an independent
developer or a student, you can build AI
solutions without paying huge API fees.
This means faster adoption and
innovation worldwide. Next, the global
AI race is heating up. The AI race
between the United States and China is
getting even more competitive. While the
United States tries to limit AI chip
exports, China is finding ways to keep
up. No matter which side you support,
one thing is clear. AI is evolving
rapidly and staying informed is more
important than ever. Next, AI is
becoming more sustainable. Training AI
models consume massive amounts of
energy. But with advancements in
optimization, we are seeing a shift
towards more efficient and eco-friendly
AI. This means lower CO2 emissions and a
reduced environmental impact. Something
that was once a major concern in AI
development. Next, career opportunities
are growing. And if you're a developer,
AI engineer, or a data scientist, this
is your moment. Companies will need
skilled professionals to build and
deploy AI solutions at a faster pace than
ever. So if you have been thinking about
getting into AI, now is the time to
start. Well, Deepseek is making big
moves, but can it really compete with
OpenAI in the long run? Which one do you
think will dominate the future of AI? So
let me know your thoughts in the
comments below. [music]
AI is no longer just responding. It's
acting, planning, and automating entire
workflows. Welcome to the era of agentic
AI, where AI agents can write code, run
businesses, and make decisions without
human input. By 2030, AI automation is
projected to be a $200 billion industry.
And those who master agentic AI tools
like AutoGPT, Devin AI, and LangChain
will lead the future. Now let's dive
into the ultimate road map to mastering
agentic AI. So first let's see how you
can build a strong foundation in
generative AI. To truly master agentic
AI, you need a strong foundation in
generative AI. Understanding how AI
models work, their evolution, and their
impact on automation. So start by
exploring how AI has evolved from
rule-based systems to advanced models
like ChatGPT and AutoGen.
You can check out Edureka's video on what
is generative AI and generative AI
examples for valuable insights into the
fundamentals of generative AI, its real
world applications and how it is
transforming various industries. So
first understand the core concepts of
agentic AI where AI can perceive, plan
and act independently to automate
complex workflows. Next learn about real
world applications such as business
automation, AI powered software
engineering and autonomous research
agents. And to deepen your knowledge,
familiarize yourself with key AI models
like GPT-4 Turbo, Claude AI, Gemini, and
Mistral, and stay updated on multi-agent
systems and self-improving AI trends.
For hands-on exploration, leverage
OpenAI's API, Claude AI, and Llama 3, or
experiment with different AI models on
Hugging Face Spaces.
You can also stay updated with AI
research papers from arXiv and Hugging
Face to keep up with the latest
breakthroughs. Edureka's generative AI
certification and training will teach
you Python programming, data science,
artificial intelligence, natural
language processing, and many other
in-demand technologies that beginners or
advanced learners are seeking. And by
understanding these concepts and
experimenting with these tools, you will
have a strong foundation to start
working with agentic AI. Next, let's
dive into the programming for AI. To
build and experiment with agentic AI,
you need to understand the fundamentals
of programming, especially in Python,
which is the backbone of AI development.
Start by learning Python basics,
focusing on data structures, loops,
functions, and object-oriented
programming.
Then explore essential AI and machine
learning libraries like NumPy and Pandas
for data manipulation, Matplotlib and
Seaborn for data visualization, and
TensorFlow and PyTorch for deep
learning. To work with AI agents, you
must also understand API interactions,
as most AI tools like OpenAI's API,
LangChain, and Hugging Face models require
API calls. Additionally, learning
automation with FastAPI, Flask, and web
scraping can help you integrate AI into
real world applications. For hands-on
practice, start small projects like
building a chatbot, creating an AI
powered summarizer, or automating data
analysis. You can also explore Edureka's
Python training and certification course
designed by industry experts, where you
will learn Python from scratch along
with key libraries like NumPy, Pandas,
Matplotlib, and scikit-learn through
hands-on projects and real-world
applications. With these powerful
prompting techniques and tools, you will
be able to optimize AI responses and
unlock the full potential of agentic AI.
To leverage agentic AI, start
experimenting with cutting-edge tools
that enable autonomous workflows.
AutoGPT and CrewAI allow you to create
multi-agent AI systems where AI agents
collaborate to complete tasks. BabyAGI
is perfect for automated research and
decision-making, helping AI iterate on
tasks dynamically. Devin AI, the first AI
software engineer, showcases how AI can
independently write, debug, and deploy
code. For hands-on learning, build
real-world projects like AI-powered
automation assistants, autonomous
research tools, or self-improving
chatbots to see agentic AI in action. By
working with these tools, you will
understand how AI can move beyond just
responding to acting intelligently and
autonomously.
Next, explore LangChain and RAG,
powerful tools that give AI the ability
to retrieve real-time information,
process external data, and enhance
decision making. To build more powerful
and context-aware AI applications,
understanding LangChain and RAG is
essential. LangChain is a must-learn
framework that enables seamless
integration of LLMs with external data
sources, allowing AI agents to interact
with APIs, databases, and documents. RAG
enhances AI models by providing memory
and real-time knowledge retrieval,
making responses more accurate and
up-to-date. For hands-on learning, try
building your own AI chatbot with
LangChain, capable of retrieving real-time
information instead of relying on static
training data. A great project idea to
explore is an AI-powered research
assistant capable of summarizing papers,
fetching real-world data, and answering
domain-specific questions. And to dive
deeper, check out our dedicated video on
LangChain and RAG, where we cover
everything in detail.
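In the meantime, here is a bare-bones sketch of the core RAG idea in plain Python; simple keyword overlap stands in for a real embedding-based vector search, and the final LLM call is left out.

```python
# Bare-bones illustration of RAG: retrieve relevant text, then stuff it into a prompt.
# Keyword overlap stands in for a real vector search; the LLM call itself is omitted.
documents = [
    "LangChain is a framework for connecting LLMs to external data sources and tools.",
    "RAG retrieves relevant documents at query time and adds them to the model's prompt.",
    "Fine-tuning changes a model's weights; retrieval only changes what it reads.",
]

def retrieve(query, docs, k=1):
    q_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return scored[:k]

query = "How does RAG keep answers up to date?"
context = "\n".join(retrieve(query, documents))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)   # in a real app, this prompt would now be sent to an LLM
```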
Next, here are some extra tips for your success. To excel in
agentic AI, consistent practice and
community engagement are key. So, start
by pushing your AI projects to GitHub
and using version control like Git to
track your progress and collaborate.
Join AI communities on Discord, Twitter,
and Hugging Face spaces where you can
interact with experts and stay updated
on trends and get feedback on your work.
Take advantage of AI internships and
open-source projects to gain real world
experience and build a strong portfolio.
Also stay updated by regularly reading
AI research papers on arXiv and Google
Scholar, keeping up with the latest
advancements in multi-agent AI and
automation. And by following these extra
tips, you will accelerate your AI
learning and career growth.
[music]
So let's begin our deep learning
interview questions and answer session
and understand what are the typical
questions which are being asked in deep
learning interview. So the first and
foremost question what any deep learning
interviewer asks is the basic
understanding or the relationship
between machine learning artificial
intelligence and deep learning. So
basically artificial intelligence is a
technique which enables machine to mimic
human behavior and machine learning is a
subset of artificial intelligence
technique which uses statistical methods
to enable machines to improve with
experience. Now deep learning on the
other hand is a subset of machine
learning which makes the computation of
multi-layer neural networks feasible. It
uses neural networks to simulate
humanlike decision making. Now coming to
the second question. Do you think deep
learning is better than machine learning
and if so, why? Though traditional
machine learning algorithms solve a lot
of our use cases, they are not very
useful while working with
high-dimensional data, that is, where we
have a large number of inputs and
outputs. For example, in the case of
handwriting recognition, we have a large
number of inputs associated with
different types of handwriting. Another
major challenge is to tell the computer
what features it should look for that
will play an important role in
predicting the outcome, as well as to
achieve better accuracy while doing so.
So these are a few of the shortcomings
that machine learning has, and deep
learning overcomes all of these
shortcomings. Now coming to our third
question which is what is a perceptron
and how does it work? Now actually our
brain has subconsciously trained itself
to do a lot of things over the years.
Now the question comes how does deep
learning mimic the functionality of the
brain? Well, deep learning uses the
concept of artificial neuron that
functions in a similar manner as the
biological neuron present in our brain.
Therefore, we can say that deep learning
is a sub field of machine learning
concerned with algorithms inspired by
the structure and the function of the
brain called artificial neural networks.
Now, if you focus on the structure of a
biological neuron, it has dendrites
which are used to receive inputs. Now
these inputs are summed in the cell body
and using the axon it is passed on to
the next biological neuron. Now
similarly a perceptron receives multiple
inputs applies various transformations
and functions and provides an output.
Now a perceptron is a linear model used
for binary classification. It models a
neuron which has a set of inputs each of
which is given a specific weight. Now the
neuron computes some function on these
weighted inputs and then finally it
provides the output. As we know that our
brain consists of multiple connected
neurons called the neural network. We
can also have a network of artificial
neurons called the perceptron to form a
deep neural network. Now coming to the
next question, what is the role of
weights and biases? For a perceptron,
there can be one additional input called
the bias. While the weights determine
the slope of the classifier line, the
bias allows us to shift the line towards
the left or right. Normally, the bias is
treated as another weighted input with a
constant input value of 1. If you have a
look at a typical perceptron, what it
receives is a set of inputs. These inputs
are not the only thing it takes in:
weights and a bias are additional inputs,
and according to those it computes and
provides an output. Now, which brings us
to the next question which is what
exactly are activation functions? So
activation function translates the
inputs into outputs and it uses a
threshold to produce an output. So the
activation function decides whether a
neuron should be activated or not by
calculating the weighted sum and further
adding the bias with it and the purpose
of the activation function is to
introduce a nonlinearity into the output
of a neuron. There can be many
activation functions, like linear or
identity, binary step, sigmoid, tanh,
ReLU, and softmax. These activation
functions are heavily used in the deep
learning industry, so one should
actually know about all of them.
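Just to make them concrete, here are a few of them sketched in NumPy for illustration:

```python
# A few common activation functions, sketched in NumPy.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))    # squashes values into (0, 1)

def tanh(x):
    return np.tanh(x)                   # squashes values into (-1, 1)

def relu(x):
    return np.maximum(0, x)             # passes positives, zeroes out negatives

def softmax(x):
    e = np.exp(x - np.max(x))           # subtract max for numerical stability
    return e / e.sum()                  # turns scores into probabilities summing to 1

z = np.array([-2.0, 0.0, 3.0])
print(sigmoid(z), relu(z), softmax(z))
```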
Now, talking about the perceptron, the
next question an interviewer might ask is: explain the
learning of a perceptron. So basically a
perceptron has four steps of learning.
The first step is initializing the
weights and the threshold, so that the
perceptron can activate a neuron by
calculating the weighted sum and adding
the bias. The second step is providing
the input and calculating the output
using the activation function. The third
step involves updating the weights: once
a perceptron learns something, it has to
update the weights so that it can keep
learning. And the final step is to
repeat steps two and three, that is,
provide the input, calculate the output,
and then update the weights
accordingly. Now, if you have a look at
the update equation, we have
w_j(t+1) = w_j(t) + η·(d − y)·x,
where w_j(t+1) is the updated weight,
w_j(t) is the old weight, η is the
learning rate, d is the desired output,
y is the actual output, and x is the
input. So this is the update equation
for the learning of a perceptron.
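As a small illustration (a toy example of my own, not from the interview answer), here are those four steps in NumPy on the AND problem, using the update equation above:

```python
# Minimal sketch of the four perceptron-learning steps on the AND problem.
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])   # inputs
d = np.array([0, 0, 0, 1])                       # desired outputs (AND)

w = np.zeros(2)        # step 1: initialize weights
b = 0.0                # ...and the bias (acts like the threshold)
eta = 0.1              # learning rate (the eta in the update equation)

for epoch in range(10):                       # step 4: repeat steps 2 and 3
    for x, target in zip(X, d):
        y = 1 if np.dot(w, x) + b > 0 else 0  # step 2: output via a step activation
        w = w + eta * (target - y) * x        # step 3: w(t+1) = w(t) + eta*(d - y)*x
        b = b + eta * (target - y)

print(w, b)                                   # learned weights and bias
print([1 if np.dot(w, x) + b > 0 else 0 for x in X])  # predictions: [0, 0, 0, 1]
```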
Now the next question is
what is the significance of a cost or a
loss function. So a cost function is a
measure of accuracy of the neural
network with respect to a given training
sample and expected output. It provides
the performance of a neural network as a
whole. And in deep learning the goal is
to minimize the cost function. So for
that we use the concept of gradient
descent. Now which brings us to the next
question which is what exactly is
gradient descent and what are its
various types. So gradient descent is an
optimization algorithm which is used to
minimize some function by iteratively
moving in the direction of the steepest
descent as defined by the negative of
the gradient. Now think of it as a bowl
in which you start from any particular
point and the goal is to reach the
bottom of the bowl which is the gradient
descent. There are three types of
gradient descent: stochastic, batch, and
mini-batch. Stochastic gradient descent
uses only a single training example to
calculate the gradient and update the
parameters accordingly. Batch gradient
descent calculates the gradients for the
whole data set and performs just one
update at each iteration. Mini-batch
gradient descent is a variation of
stochastic gradient descent where,
instead of a single training example, a
mini-batch of samples is used, and it is
one of the most popular optimization
algorithms. Now, if we talk about
mini-batch gradient descent, one might
ask: what are the benefits of mini-batch
gradient descent, or how is it more
useful than the others? Mini-batch
gradient descent is more efficient when
compared to stochastic gradient descent,
and generalization improves because it
tends to find flatter minima; it also
helps approximate the gradient of the
entire training set, which helps us
avoid poor local minima. This is why
mini-batch gradient descent is preferred
over plain stochastic gradient descent. Now
one might ask what are the steps for
using a gradient descent algorithm.
First of all, initialize random weights
and biases. After that, pass an input
through the network and get values from
the output layer. Next, calculate the
error between the actual value and the
predicted value; this can be done in a
number of ways. The next step is to go
to each neuron which contributes to the
error and change its respective weights
to reduce the error, since our goal is
to reduce the cost of the model. After
that, reiterate until you find the best
weights of the network and the lowest
cost for the particular
network. So one might ask you to write a
gradient descent program, or the
pseudocode of a gradient descent
program. First of all, we define the
parameters, which are the hidden
weights, the output weights, the hidden
bias, and the output bias. We define a
function, sgd, with arguments for the
cost, the parameters we have discussed,
and the learning rate. We then define
the gradients of our parameters with
respect to the cost function; here we
use the Theano library to find the
gradients. Finally, we iterate through
all the parameters to find the updates
for each of them. So you can see that we
use vanilla gradient descent here; it
returns the updates, and we use them to
update the parameters and the cost. The
ultimate goal of any gradient descent
algorithm is to minimize the cost.
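Since that walkthrough leans on Theano, here is a rough NumPy-only sketch of the same vanilla gradient descent idea on a tiny made-up linear-regression cost:

```python
# Bare-bones NumPy sketch of vanilla gradient descent on a tiny linear-regression cost.
# The data, learning rate, and step count are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 1))
y = 3.0 * X[:, 0] + 2.0 + 0.1 * rng.standard_normal(100)   # true weight 3, true bias 2

w, b = 0.0, 0.0          # initialize the weight and bias
lr = 0.1                 # learning rate

for step in range(200):
    y_pred = w * X[:, 0] + b                 # forward pass through the "network"
    error = y_pred - y                       # difference between predicted and actual
    grad_w = 2 * np.mean(error * X[:, 0])    # gradient of the mean squared cost w.r.t. w
    grad_b = 2 * np.mean(error)              # gradient of the cost w.r.t. b
    w -= lr * grad_w                         # step in the direction of steepest descent
    b -= lr * grad_b

print(round(w, 2), round(b, 2))              # close to 3.0 and 2.0
```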
Now, talking about the perceptron, what
are the shortcomings of a single-layer
perceptron. Well, there are two major
problems. First, a single-layer
perceptron cannot classify non-linearly
separable data points. Second, complex
problems that involve a lot of
parameters cannot be solved by a
single-layer perceptron.
Consider an example here, and the
complexity which arises when many
parameters are involved in a decision
taken by a marketing team. First of all,
we have the categories, which are email,
direct, paid, referral program, or
organic. Inside these categories we have
subcategories, which are Google,
Facebook, LinkedIn, Twitter, and
Instagram. Inside those we have the types
of subcategory, which are search ads,
remarketing ads, interest ads, and
lookalike ads. And if we do a further
subdivision, we have the parameters to
consider, which are the customer
acquisition cost, the money spent, the
click rate, the leads generated, the
customers generated, and the time taken
to become a customer. So one neuron
cannot take in so many inputs, and that
is why more than one neuron would
be used to solve this problem. Now, which
brings us to the question what is a
multi-layer perceptron. So a multi-layer
perceptron or MLP is a class of feed
forward artificial neural network and it
is composed of more than one perceptron.
They are composed of an input layer to
receive the signal. An output layer that
makes a decision or the prediction about
the input and in between these two an
arbitrary number of hidden layers that
are the true computational engine of any
multi-layer perceptron. Now one might
ask what are the different parts of any
multi-layer perceptron or a neural
network. So first of all what we have
are input nodes. The input nodes
provide information from the outside
world to the network and are together
referred to as the input layer. No
computation is performed in any of the
input nodes. They just pass the
information to the hidden layers. Now
hidden nodes have no direct connection
with the outside world. Hence the name
hidden. And what they do is they perform
computation and transfer the information
from the input nodes to the output
nodes. Now a collection of hidden nodes
forms the hidden layer and while a
network will only have a single input
layer and a single output layer it can
have zero to n number of hidden layers
and a multi-layer perceptron has more
than one hidden layer. Now if we talk
about output nodes, the output nodes are
collectively referred to as the output
layer and are responsible for the
computation and transferring information
from the network to the outside world
and hence they are also
responsible for the prediction. Now
coming to our next question, what
exactly is data normalization and why do
we need it? Data normalization is a very
important pre-processing step used to
normalize the data: the data should not
be left-skewed or right-skewed, and it
rescales the values to fit in a
specific range to ensure better
convergence during back propagation. In
general, it boils down to subtracting
the mean from each data point and
dividing by the standard deviation, so
that we get normally distributed data,
and it makes computation easier during
back propagation in neural networks.
So this is a very important part of any
deep neural network.
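As a quick illustration with made-up numbers, the z-score version of this is just:

```python
# Z-score normalization: subtract the mean, divide by the standard deviation.
import numpy as np

data = np.array([12.0, 15.0, 9.0, 30.0, 18.0])
normalized = (data - data.mean()) / data.std()
print(normalized)
print(round(float(normalized.mean()), 6), round(float(normalized.std()), 6))  # ~0.0 and 1.0
```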
Now, talking about deep neural networks,
or neural networks in general, coming to
our next question: what is better, the
deep networks or the shallow ones and
why? Now both the networks be it shallow
or deep are capable of approximating any
function. What matters is how precise
that network is in terms of getting the
result. Now a shallow network works with
only a few features as it cannot extract
more. But a deep network goes deep by
computing efficiently and working on
more features or the parameters. Now
deeper networks are able to create deep
representation at every layer. The
network learns a new more abstract
representation of the input and hence
deep neural networks are better than the
shallow ones. So what exactly is weight
initialization in a neural network. Now
as we saw we had weight initialization
in perceptron. So weight initialization
is one of the very important steps. A
bad weight initialization can prevent a
network from learning but good weight
initialization can help it in giving
quicker convergence and a better overall
error. Now biases can be generally
initialized to zero. The rule for
setting the weights is to be close to
zero without being too small because
every time the weight is being
multiplied by the inputs, the result
gets smaller and smaller. Now talking
about neural networks, what is the
difference between a feed forward and a
back propagation neural network? Now a
feed forward neural network is a type of
neural network architecture where the
connections are fed forward that is they
do not form cycles. The term feed
forward is also used when you input
something at the input layer and it
travels from the input to the hidden and
from the hidden to the output layer. The
values are fed forward. Now, back
propagation is a training algorithm
which consists of two steps majorly. The
first one is feed forwarding the values
and the second one is to calculate the
error and propagate it back to the
earlier layers. So to be precise forward
propagation is a part of back
propagation algorithm but it comes
before the back propagation.
So one might ask the question which is
one of the most important questions is
that what are the hyperparameters in a
neural networks and name a few of these
hyperparameters.
So hyperparameters are the variables
which determine the network structure
that is for example the number of hidden
units and or the hidden layers and the
variables which determine how the
network is trained for example the
learning rate. There are usually two
types of hyperparameters. One type is the
network parameters, which are associated
with the network itself: the number of
layers, the network weight
initialization, and the activation
function. The other type is the training
parameters: the learning rate, momentum,
the number of epochs, the batch size,
and much more. Now a lot
of hyperparameters also differ when we
work along with different types of
neural networks. So as in CNN we get
extra parameters to work on when
considering CNN which are the
convolutional neural networks and
sometimes we have to deal with less
number of hyperparameters. It all
depends upon the type of neural network
which you are using. So uh which brings
us to the next question is that explain
the different hyperparameters related to
networking and training. So in training
we have first of all we have the number
of hidden layers. So hidden layers are
the layers between the input and the
output layers as we just discussed and
many hidden units within a layer with
regularization technique can increase
the accuracy as smaller number of units
may cause underfitting. Now another
important aspect is network weight
initialization. So ideally it may be
better to use different weight
initialization schemes according to the
activation function used on each layer.
Mostly uniform distribution is used or
the normal distribution. Now if we talk
about activation function so they are
also used to introduce nonlinearity to
the models. They're also used to
introduce nonlinearity to the models
which allows deep learning models to
learn nonlinear prediction boundaries.
Generally, the rectifier activation function, or ReLU, is the most popular. Those were the network parameters that have to be set for a deep neural network before training begins. Next come the training parameters, starting with the learning rate.
The learning rate defines how quickly a network updates its parameters. A low learning rate slows down the learning process but converges smoothly, while a larger learning rate speeds up learning but may not converge as smoothly. Usually a decaying learning rate is preferred, so that we get the best of both worlds. Another hyperparameter is momentum: momentum uses knowledge of the previous step to inform the direction of the next step, which helps prevent oscillation; a typical choice of momentum is between 0.5 and 0.9. If we talk about the number of epochs, an epoch is one full pass over the training data, so the number of epochs is the number of times the whole training set is shown to the network during training. You can increase the number of epochs until the validation accuracy starts decreasing even while the training accuracy keeps increasing, which is a sign of overfitting. And if we talk about the batch size, the mini-batch size is the number of samples given to the network after which a parameter update happens. A good default for the batch size might be 32, or alternatively 16 or 64; it depends on the size of your data, and it can be any arbitrary number, but it is usually better to keep it a power of two.
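To make those network and training parameters concrete, here is a minimal, hedged Keras-style configuration; the layer sizes, dataset placeholders and exact values are illustrative assumptions, not part of the original discussion.

```python
import tensorflow as tf

# Network parameters: layers, weight initialization, activation function
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu",
                          kernel_initializer="glorot_uniform",  # a uniform scheme
                          input_shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Training parameters: learning rate, momentum, epochs, batch size
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)
model.compile(optimizer=optimizer,
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# x_train / y_train stand in for whatever dataset you actually use
# model.fit(x_train, y_train, epochs=20, batch_size=32, validation_split=0.1)
```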
So, while we were talking about overfitting, this brings us to our next question: what exactly is dropout? Dropout is a regularization technique to avoid overfitting, which improves validation accuracy and thus increases the generalization power of the model. Generally, use a small dropout value of 20% to 50% of the neurons, with 20% providing a good starting point: a probability that is too low has minimal effect, and a value that is too high results in under-learning by the network.
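As a hedged sketch of where dropout sits in a model, here is a small Keras-style stack; the 20% rate is just the starting point mentioned above, and the layer sizes are arbitrary.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dropout(0.2),   # randomly drops 20% of activations during training
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation="softmax"),
])
```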
In practice, use dropout together with a reasonably large network: you are likely to get better performance when dropout is used on a larger network, since it gives the model more opportunity to learn independent representations. Now our next question: in a neural network, you notice that the loss does not decrease in the first few epochs. What could be the possible reasons? The reasons could be that the learning rate is too low, that the regularization parameter is too high, or that the optimization is stuck at a local minimum; in that last case it might take a certain number of iterations to get out of the local minimum and finally reach a lower point, so a different approach to the problem may have to be tried at that particular point. Now, talking about deep learning, one might be asked to name a few deep learning frameworks that are used in the industry. First of all, the foremost and most popular deep learning library is TensorFlow.
Next we have Caffe, the Microsoft Cognitive Toolkit (CNTK), and Torch or PyTorch, which is standing out from the crowd, with many people now preferring PyTorch over TensorFlow. MXNet is another deep learning framework, and we also have Chainer and Keras. Keras, as you know, can be integrated with Theano as well as TensorFlow, and it is considered one of the best and simplest deep learning frameworks. Now, one might ask: what exactly are tensors? Tensors are nothing but the de facto standard for representing data in deep learning. What I mean is that tensors are just multi-dimensional arrays that allow you to represent data with higher dimensions. In deep learning you generally deal with high-dimensional data sets, where the dimensions refer to the different features present in the data set, so what you need is a multi-dimensional array, or that kind of data structure. That is exactly what a tensor is, and in fact the name TensorFlow is derived from the operations which the neural network performs on tensors; it is literally a flow of tensors.
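A quick, hedged illustration of tensors as multi-dimensional arrays; the shapes and values here are arbitrary examples.

```python
import tensorflow as tf

scalar = tf.constant(3.0)                       # rank-0 tensor
vector = tf.constant([1.0, 2.0, 3.0])           # rank-1 tensor, shape (3,)
matrix = tf.constant([[1.0, 2.0], [3.0, 4.0]])  # rank-2 tensor, shape (2, 2)
images = tf.zeros([32, 28, 28, 3])              # rank-4 tensor: batch, height, width, channels

print(matrix.shape)      # (2, 2)
print(tf.rank(images))   # 4
```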
Now, talking about TensorFlow: one might ask, since it is the most popular deep learning framework and companies prefer people who know TensorFlow and have worked with it, what are a few advantages of TensorFlow? First of all, it has platform flexibility. It is easily trainable on CPU as well as GPU and supports distributed computing. TensorFlow has automatic differentiation capabilities, advanced support for threads and asynchronous computation, and it is a customizable, open-source framework. Most importantly, the recently released TensorFlow 2.0 comes with a lot of interesting features: it has fully adopted Keras as its high-level API, so the coding side is much simplified, and eager execution is now on by default, so you do not have to write loads and loads of lines of code. If you want to know more about TensorFlow 2.0 and why it is currently one of the best deep learning frameworks in the industry, go ahead and check our TensorFlow 2.0 video; I'll leave the link in the description box below, so go check it out and see how exactly it improves on the previous version.
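As a small, hedged illustration of eager execution in TensorFlow 2.x, operations run immediately and return concrete values, with no session required; the matrices below are arbitrary examples.

```python
import tensorflow as tf

a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[1.0, 1.0], [0.0, 1.0]])

# In TF 2.x this executes immediately and prints a concrete result,
# instead of building a graph to run later in a session.
print(tf.matmul(a, b))
print(tf.executing_eagerly())  # True by default in TensorFlow 2.x
```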
Now, talking about computational graphs, one might ask what exactly they are. A computational graph is a series of TensorFlow operations arranged as nodes in a graph. Each node takes zero or more tensors as input and produces a tensor as output. Basically, one can think of a computational graph as an alternative way of conceptualizing the mathematical calculations that take place in a TensorFlow program. The operations assigned to the different nodes of a computational graph can be executed in parallel, thus providing better performance in terms of computation.
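In TensorFlow 2.x, a graph is typically built by tracing a Python function with tf.function; here is a minimal, hedged sketch where the traced function itself is just an illustrative toy.

```python
import tensorflow as tf

@tf.function  # traces the Python function into a TensorFlow graph
def affine(x, w, b):
    return tf.matmul(x, w) + b  # each operation becomes a node in the graph

x = tf.ones([1, 3])
w = tf.ones([3, 2])
b = tf.zeros([2])
print(affine(x, w, b))  # runs the traced graph
```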
So one might ask: what exactly is a convolutional neural network? A convolutional neural network, also called a CNN or ConvNet, is a class of deep neural networks most commonly applied to analyzing visual imagery. CNNs use a variation of the multilayer perceptron designed to require minimal preprocessing. If you are going for an interview that requires you to work with a lot of images or videos, CNNs are used very heavily, so having a good knowledge of CNNs is always an advantage in that case. The next question we have here is: what are the various layers of a CNN? There are four layer concepts everyone should understand in convolutional neural networks: first the convolutional layer, second the ReLU layer, then the pooling layer, and finally the fully connected layer; a minimal sketch of this stack follows below.
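Here is a hedged Keras-style sketch of those four layer types stacked together; the filter count, kernel size and input shape are illustrative assumptions.

```python
import tensorflow as tf

cnn = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), input_shape=(28, 28, 1)),  # convolutional layer
    tf.keras.layers.ReLU(),                                        # ReLU layer
    tf.keras.layers.MaxPooling2D((2, 2)),                          # pooling layer
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),               # fully connected layer
])
cnn.summary()
```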
Now, if we talk about CNNs, we also have to talk about RNNs. So one might ask: what exactly is an RNN? RNNs, or recurrent neural networks, are a type of artificial neural network designed to recognize patterns in sequences of data, such as text, genomes, handwriting, the spoken word, and numerical time-series data from sensors, stock markets or government agencies. Recurrent neural networks use the backpropagation algorithm for training, but it is applied at every time step; this is commonly known as backpropagation through time, or BPTT.
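A minimal, hedged sketch of a recurrent layer processing a sequence follows; the sequence length, feature count and unit count are arbitrary examples.

```python
import tensorflow as tf

rnn = tf.keras.Sequential([
    # 20 time steps, 8 features per step; the layer carries a hidden state across steps
    tf.keras.layers.SimpleRNN(16, input_shape=(20, 8)),
    tf.keras.layers.Dense(1),
])
rnn.compile(optimizer="adam", loss="mse")  # training unrolls the network through time (BPTT)
rnn.summary()
```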
Now our next question is: what are some issues faced while training an RNN? Recurrent neural networks use the backpropagation algorithm, as I just mentioned, but it is applied at every time step, and there are some issues with backpropagation such as the vanishing gradient and the exploding gradient, where the gradient either vanishes or becomes too large to handle. That brings us to the next set of questions, the first of which is: what exactly is a vanishing gradient, and how is it harmful? When we do backpropagation, that is, move backward in the network calculating the gradients of the loss (the error) with respect to the weights, the gradients tend to get smaller and smaller as we keep moving backward through the network. This means that the neurons in the earlier layers learn very slowly compared to the neurons in the later layers of the hierarchy, so the earlier layers in the network are the slowest to train. Why is this harmful? The earlier layers are important because they are responsible for learning and detecting the simple patterns and are the actual building blocks of the neural network. If they give improper and inaccurate results, how can we expect the next layers and the complete network to perform well and produce accurate results? So the training process takes too long, and the prediction accuracy of the model decreases.
Another question that arises here is: what exactly is the exploding gradient problem? This is just the opposite of the vanishing gradient problem. Exploding gradients occur when large error gradients accumulate and result in very large updates to the neural network weights during training. The gradients are used during training to update the network weights, and this process works best when the weights stay small and controlled; when the magnitudes of the gradients accumulate, an unstable network is likely to occur, which causes poor predictions or even a model that reports nothing useful whatsoever.
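A tiny, hedged numeric illustration of both effects: repeatedly multiplying a gradient by per-layer factors below or above one shrinks it towards zero or blows it up. The factors 0.5 and 1.5 are arbitrary stand-ins for per-layer gradient magnitudes.

```python
# Illustrative only: a gradient signal passed back through 30 "layers"
grad = 1.0
for _ in range(30):
    grad *= 0.5          # factor < 1 at every layer -> vanishing gradient
print(grad)              # ~9.3e-10: early layers barely receive any signal

grad = 1.0
for _ in range(30):
    grad *= 1.5          # factor > 1 at every layer -> exploding gradient
print(grad)              # ~1.9e+05: weight updates become unstable
```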
So the vanishing gradient and the exploding gradient are two problems that occur during backpropagation in a recurrent neural network. Our next question is: what are LSTMs? Long short-term memory networks, or LSTMs, are an artificial recurrent neural network architecture used in the field of deep learning. Unlike standard feed-forward neural networks, an LSTM has feedback connections, which make it, in a sense, a general-purpose computer. It can process not only single data points but entire sequences of data. LSTMs are a special kind of RNN, or recurrent neural network, capable of learning long-term dependencies.
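A minimal, hedged sketch of an LSTM layer on sequence data follows; again, the shapes and unit counts are illustrative assumptions.

```python
import tensorflow as tf

lstm_model = tf.keras.Sequential([
    # 50 time steps, 10 features per step; the LSTM's gates and cell state
    # let it carry information across long spans of the sequence
    tf.keras.layers.LSTM(32, input_shape=(50, 10)),
    tf.keras.layers.Dense(1),
])
lstm_model.compile(optimizer="adam", loss="mse")
lstm_model.summary()
```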
Now one might ask: what are capsules in a capsule neural network? Capsules are vectors, that is, elements with a size and a direction, specifying the features of an object and its likelihood. These features can be any of the instantiation parameters, like the pose: position, size, orientation, deformation, velocity, albedo (light reflection), hue, texture and much more. A capsule can also specify attributes like angle and size, so it can represent the same generic information at varying poses. Just as a neural network has layers of neurons, a capsule network can have layers of capsules, so there can be higher-level capsules representing groups of the objects or capsules below them. This helps in getting a deeper understanding of a particular object or data set, with knowledge from different aspects or different angles.
So the next question that arises is: explain autoencoders and their uses. An autoencoder neural network is an unsupervised machine learning algorithm that applies backpropagation, setting the target values to be equal to the inputs. Autoencoders are used to reduce the size of our inputs into a smaller representation, and if anyone needs the original data, they can reconstruct it from the compressed data. Now one might ask: how does an autoencoder differ from PCA? An autoencoder can learn non-linear transformations, thanks to non-linear activation functions and multiple layers. It does not have to rely only on dense layers; it can use convolutional layers, which are better for video, image and series data. It is more efficient to learn several layers with an autoencoder than to learn one huge transformation with PCA. An autoencoder provides a representation of each layer as an output, and it can make use of pre-trained layers from other models to apply transfer learning and enhance the encoder or the decoder. These are a few of the reasons why autoencoders can be preferable to PCA, even though both of them perform the same task, which is mostly dimensionality reduction.
reduction. Now give some real life
examples where autoenccoders can be
applied. So the first of all we talk
about dimensionality reduction or the
first thing that should pop up in your
mind is dimensionality reduction. So the
recrossected image is the same as our
input image but with reduced dimensions.
Now it helps in providing similar image
with reduced pixel value and it can be
used in various areas where we have
limited storage or we have limited
processing power. So when there is a
high input or an image or a data with
high dimension or which has higher
values pixel values it can compress and
provide the same image with a lower
pixel value. Right? Or colors are used
for converting any black and white
picture into a colored image. Believe it
or not and depending on what is in the
picture it is possible to tell what the
color should be. Now feature variation.
If we talk about feature variation, it
attracts only the required features of
an image and generates the output by
removing any unnecessary noise or
unnecessary interruption. And if we talk
about dnoising image, the input seen by
an autoenccoder is not the raw input but
a stochastically corrupted version. A
dinoising autoenccoder is thus train to
reconstruct the original input from the
noisy version. Now talking about
Now, talking about autoencoders, one might ask about the different layers of an autoencoder. Basically, an autoencoder consists of three components: the encoder, the code and the decoder, which brings us to the next question: explain the architecture of an autoencoder. Talking about those three parts, the encoder is the part of the network that compresses the input into a latent-space representation: the encoder layer encodes the input image as a compressed representation in a reduced dimension, and this compressed image is a distorted version of the original image. The middle part, the code, represents the compressed input that is fed to the decoder; it is basically the channel between the two. And the decoder is the layer that decodes the encoded image back into its original dimension; the decoded image is a lossy reconstruction of the original image, reconstructed from the latent-space representation.
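Here is a hedged Keras-style sketch of that encoder–code–decoder structure; the 784-dimensional input (for example a flattened 28x28 image) and the 32-dimensional code are illustrative assumptions.

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(784,))
encoded = tf.keras.layers.Dense(128, activation="relu")(inputs)     # encoder
code = tf.keras.layers.Dense(32, activation="relu")(encoded)        # code / bottleneck
decoded = tf.keras.layers.Dense(128, activation="relu")(code)       # decoder
outputs = tf.keras.layers.Dense(784, activation="sigmoid")(decoded)

autoencoder = tf.keras.Model(inputs, outputs)
# Target equals input: the network learns to reconstruct what it is given
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.summary()
```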
Now one might ask: what exactly is the bottleneck in an autoencoder, and why is it used? The layer between the encoder and the decoder, that is the code, is also known as the bottleneck. It is a well-designed approach to decide which aspects of the observed data are relevant information and which aspects can be discarded. It does this by balancing two criteria: first, the compactness of the representation, measured as its compressibility; and second, how well it retains the behaviourally relevant variables from the input. Now one might ask: are there any variations of autoencoders? Surely there are: we have convolutional autoencoders, sparse autoencoders, deep autoencoders and contractive autoencoders.
All of these autoencoders have a different structure or a different code layer. If we talk about the convolutional autoencoder, it has a CNN-like structure: on the encoder side we have the convolutional layers, the ReLU layer and the pooling layer, and then finally we have the decoding layers.
Another question that might pop into the interviewer's mind is: what are deep autoencoders? A deep autoencoder is the extension of the simple autoencoder. The first layer of a deep autoencoder learns first-order features of the raw input, the second layer learns second-order features corresponding to patterns in the appearance of the first-order features, and the deeper layers tend to learn even higher-order features. A deep autoencoder is composed of two symmetrical deep belief networks: the first four or five shallow layers represent the encoding half of the net, and the second set of four or five layers makes up the decoding half. Interesting, right? Another important topic in deep learning is the restricted Boltzmann machine. One might ask: what exactly is an RBM, or restricted Boltzmann machine? An RBM is an undirected graphical model that has played a major role in deep learning frameworks in recent times, and it is an algorithm used for dimensionality reduction. Not only that, it is also used for classification, regression, collaborative filtering, feature learning and topic modelling.
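As a hedged, minimal example of training an RBM in practice, scikit-learn ships a Bernoulli RBM; the tiny binary dataset below is purely illustrative.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

# Toy binary data: 6 samples with 4 visible units each
X = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [0, 1, 0, 1],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 1, 0]])

rbm = BernoulliRBM(n_components=2, learning_rate=0.05, n_iter=20, random_state=0)
rbm.fit(X)                      # unsupervised training
print(rbm.transform(X).shape)   # (6, 2): hidden-unit activations, a reduced representation
```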
When we talk about RBMs being useful for dimensionality reduction, another question arises: how does an RBM differ from an autoencoder? An autoencoder is a simple three-layer neural network where the output units are connected back to the input units. Typically the number of hidden units is much smaller than the number of visible ones, and the training task is to minimize a reconstruction error, that is, to find the most efficient compact representation of the input data. An RBM shares a similar idea, but it uses stochastic units with a particular distribution instead of deterministic units, and the training task is to find out how these two sets of variables are actually connected to each other. One aspect that distinguishes an RBM from an autoencoder is that it has two biases: the hidden bias helps the RBM produce the activations on the forward pass, while the visible-layer biases help the RBM learn the reconstruction on the backward pass.
Now this brings us to the final question of our deep learning interview: what are some limitations of deep learning? I bet you weren't expecting this one, but there are some limitations. Deep learning usually requires large amounts of training data, and deep neural networks are easily fooled. The successes of deep learning are largely empirical, and deep learning algorithms have been criticized as uninterpretable black boxes, because one important thing about deep learning is that you do not specify what you are looking for; the algorithm learns on its own. That is one of the shortcomings of deep learning, and so far deep learning has not been well integrated with prior knowledge. A lot of people still do not see it as a way to approach their problems, because they do not understand what exactly deep learning is, how it works, or how to set all of its variables, the hyperparameters. These are some of the limitations of deep learning as of now, and we hope that as the technology advances and people learn more about what deep learning is and how artificial intelligence can be achieved through it, they will be more open to it and these limitations will be overcome.
So guys, that's it from my side, and I hope you got to know a lot about deep learning interview questions, which might help you crack interviews and land a great job as a data scientist, machine learning engineer or artificial intelligence engineer. One important thing I would like to say is that data scientist roles are somewhat industry specific: if you are working in healthcare, you should know about the healthcare industry too, rather than just knowing about the data and the numbers, and if you are working with imagery, you should know about images and what you are dealing with. A good knowledge of the particular industry you are working in will give you a great advantage over other competitors, and since you now know a lot of this material, I am sure you will be able to land a great job in any of these industries. And with this, we have come to the end of this agentic AI course for beginners. If you enjoyed listening to this course, please be kind enough to like it, and you can comment with any of your doubts and queries; we will reply to them at the earliest. Do look out for more videos and playlists, and subscribe to Edureka's YouTube channel to learn more. Thank you for watching and happy learning.