Rafay Khan
1 min read · Sep 22, 2020


@mistertandon Thanks a lot for reading my post. I'm glad you found it useful.

To answer your question: the vectorized implementation does take the average of the gradients, just like the pseudo-code in figure 39. You may have overlooked figure 50 while reading the post; it shows how the derivatives end up being averages because of the Cost Function we use.
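To make the point concrete, here is a minimal sketch rather than the code from the post. It assumes a single linear neuron and a cost of the form C = (1/(2m)) Σ (a_i − y_i)², and the post's exact cost may differ. The toy data and all names (`X`, `Y`, `w`, `predict`) are hypothetical. It uses plain Python lists so it runs without dependencies; with NumPy the second computation would be a single matrix product.

```python
# Hypothetical toy data (not from the post): 3 examples, 2 features each.
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
Y = [1.0, 0.0, 1.0]
w = [0.5, -0.5]
m = len(X)  # number of training examples


def predict(x, w):
    """Output of a single linear neuron: a = w . x (no bias, for brevity)."""
    return sum(xj * wj for xj, wj in zip(x, w))


# Loop form: compute one gradient per example, then average explicitly.
per_example = [[(predict(x, w) - y) * xj for xj in x] for x, y in zip(X, Y)]
grad_loop = [sum(g[j] for g in per_example) / m for j in range(len(w))]

# Vectorized form: dW = (1/m) * X^T (A - Y).
# The 1/m comes from the assumed cost function, so the sum over examples
# hidden inside the matrix product is already an average.
errors = [predict(x, w) - y for x, y in zip(X, Y)]
grad_vec = [sum(X[i][j] * errors[i] for i in range(m)) / m for j in range(len(w))]

print(grad_loop)
print(grad_vec)
```

Both lists hold the same values (up to floating-point rounding), so under this assumption the vectorized expression is the per-example average, just without an explicit loop.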
