Is my back propagation math correct?
I have been programming a feed-forward neural network that uses stochastic gradient descent, and I am still a little confused about the calculus. To make sure I have the math correct, I am using the following network as an example: (neural network image)
I believe the equation for the output of the network should be the following ( s(x) is the sigmoid ):
$$
s\left(x\right) = \frac{1}{1+e^{-x}}
$$
$$
O=s\left(s\left(s\left(Iw_1+1\right)w_2+1\right)w_4+s\left(s\left(Iw_1+1\right)w_3+1\right)w_5+1\right)
$$
When I did the math to get the derivative of O with respect to w1, I got: ( d(x) is the derivative of the sigmoid )
$$
d\left(x\right)=s\left(x\right)\left(1-s\left(x\right)\right)
$$
$$
d\left(s\left(s\left(Iw_1+1\right)w_2+1\right)w_4+s\left(s\left(Iw_1+1\right)w_3+1\right)w_5+1\right)\cdot\left(\left(d\left(s\left(Iw_1+1\right)w_2+1\right)\cdot d\left(Iw_1+1\right)\cdot I\right)+\left(d\left(s\left(Iw_1+1\right)w_3+1\right)\cdot d\left(Iw_1+1\right)\cdot I\right)\right)
$$
I used https://www.desmos.com/calculator to see whether it gave the same answer for the derivative of O with respect to w1. With all the weights set to 0.5, all the biases set to 1, and the input set to 1, Desmos said that the derivative was 0.0014291881022, but my equation gave 0.00571675240882. Is there a mistake somewhere in my math, or is this some quirk of how Desmos computes derivatives? Sorry if I did something simple wrong or messed up the notation. I am still very new to calculus.
calculus partial-derivative neural-networks
asked Dec 21 '18 at 5:10
That_one_guy
1 Answer
It looks like you're missing the weight factors that the chain rule produces in each term of the bracketed sum. The full expression should be
$$
\frac{\partial O}{\partial w_1} = d(s(s(Iw_1+1)w_2+1)w_4+s(s(Iw_1+1)w_3+1)w_5+1)\cdot[(\mathbf{w_4}\,d(s(Iw_1+1)w_2+1)\cdot\mathbf{w_2}\,d(Iw_1+1)\cdot I)+(\mathbf{w_5}\,d(s(Iw_1+1)w_3+1)\cdot\mathbf{w_3}\,d(Iw_1+1)\cdot I)],
$$
with the missing terms in bold.
Since you set all the weights to 0.5 in your check, the missing products $w_4w_2$ and $w_5w_3$ each equal 0.25.
And indeed, the Desmos value is 0.25 times your answer: $0.00571675240882 \times 0.25 \approx 0.0014291881022$.
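The factor of 0.25 can also be checked numerically. The following is a minimal Python sketch, not part of the original posts, assuming the network from the question with every bias fixed at 1. The names `output` and `grad_w1` and the `include_weights` flag are made up for illustration. It compares a central finite difference of O with the chain-rule expression, once with and once without the weight factors.

```python
import math


def s(x):
    """Sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def d(x):
    """Derivative of the sigmoid, s(x) * (1 - s(x))."""
    return s(x) * (1.0 - s(x))


def output(I, w1, w2, w3, w4, w5):
    """Forward pass of the network in the question (all biases are 1)."""
    h = s(I * w1 + 1)
    return s(s(h * w2 + 1) * w4 + s(h * w3 + 1) * w5 + 1)


def grad_w1(I, w1, w2, w3, w4, w5, include_weights=True):
    """dO/dw1 by the chain rule.

    include_weights=False drops w2, w3, w4, w5 from the inner sum,
    which reproduces the expression written in the question.
    """
    z1 = I * w1 + 1              # input to the first hidden neuron
    h = s(z1)
    z2, z3 = h * w2 + 1, h * w3 + 1
    zo = s(z2) * w4 + s(z3) * w5 + 1
    if include_weights:
        inner = w4 * d(z2) * w2 * d(z1) * I + w5 * d(z3) * w3 * d(z1) * I
    else:
        inner = d(z2) * d(z1) * I + d(z3) * d(z1) * I
    return d(zo) * inner


I, w = 1.0, 0.5
eps = 1e-6
# Central finite difference in w1 only; the other weights stay at 0.5.
numeric = (output(I, w + eps, w, w, w, w) - output(I, w - eps, w, w, w, w)) / (2 * eps)
with_weights = grad_w1(I, w, w, w, w, w)
without_weights = grad_w1(I, w, w, w, w, w, include_weights=False)
print(numeric, with_weights, without_weights)
```

The first two printed values should agree closely with the Desmos figure quoted in the question, and the third with the question's own figure, which is four times larger.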
answered Dec 21 '18 at 5:42
Ben Lansdell