# Parameter scaling in nonlinear optimization

$\def\R{\mathbb R} \def\F{\mathbf F}$

You know the scenario: you have a function $f:\R^n \rightarrow \R$ and a starting guess $x_0 \in \R^n$. You want to find a better $x$, ideally the one for which $f(x)$ is smallest, which you might call $x^\ast$.

(I'll assume $f(x) = \|\F(x)\|^2 = \sum_{i=1}^m F_i(x)^2$ for $\F:\R^n\rightarrow\R^m$ so I can talk about the Jacobian $J = \frac{\partial\F}{\partial x}$, but if your problem doesn't look like that, just let $m=1$, or read Jacobian as gradient.)

Many nonlinear optimization packages aim to be scaling-covariant, that is they promise to give you an equivalent answer even if you scale x with a diagonal matrix D. That is, if

xstar = minimize(f, x0)

and you define (Julia syntax)

g(x) = f(D*x)
ystar = minimize(g, inv(D)*x0)

then they promise that

ystar = inv(D)*xstar

This is a good thing, right?

Well, sure it's a good thing. But it's generally implemented badly. In fact, it's basically not really possible to do it right. Most attempts fail if your starting point is at the origin, which may well be a very reasonable starting point, or if one column of the Jacobian is zero at the starting point, again this may well be reasonable. 1)

So instead I far prefer to place some onus on you, the user of the routine. Your job is this:

Scale x so its components are “around 1”

This is actually pretty easy in practice. Just think about the parameters in your model: maybe they are spring constants, initial temperatures, robot joint angles, phase of the moon. Think about the output of your model. Maybe it's predicted energy use in watts. Think about the real-world application. Is it energy use of a city? Then use megawatts or gigawatts to keep the objective function value around 1. Actually, this is the easy bit – almost all implementations deal well with scaling the objective, but I want to get you into the regime of thinking about scaling.

Then, to scale the parameters, ensure that when they change by about 1e-5, the objective also changes by a reasonable amount. So if one of the parameters is phase of the moon, it might feel natural to measure it in days (0 < phoon < 29.5) with some handling for wraparound. Then your labmate points out that the SI unit for time is the second so you dutifully reparameterize it in seconds, in the range (0, 2500000) But if you add 1e-5 seconds, it will probably change the objective by very little, maybe by 1e-12 gigawatts, and the optimizer will be in trouble.

Your first idea (do it in days) was actually a better choice (about 86400 times better), giving a change of 1e-7 GW when you add 1e-5 days (about a second). Can we do even better?

Well, look at how the parameter appears in the objective funciton. For phase of the moon, that's normally in some term like:

$$\cos(\eta * \mathtt{phoon})^\beta$$

so a sensible range might be 0 to $2\pi/\eta$. The point is just to think about how a change in phoon affects the objective.

#### But that's the easy case, what about dimensionless parameters?

Yeah, not an issue. Imagine that beta above is also part of the optimization. 2) Again, we perform a little thought experiment to see what a reasonable range might be. Below 1 it will flatten out the curve, that could be useful to model some DC effect, but might also cause phoon to be conflated with some other parameter, and ultimately will make the function quite insensitive to phoon…. Hmmm… Recheck thoughts above.. OK, let's keep beta bigger than 0.5, and smaller than say 10. Quick mental check if adding 1e-5 at either end of that range will do anything very weird. Seems OK.

$\def\e#1#2{{\mathbf e}^{#1}_{#2}}$

#### Another reason for good scaling: finite difference Jacobian calculation

In many situations, you aren't yet sure enough of your model function $\F(x)$ to have written code to compute the Jacobian. Or, you have written that code, but you need to test it. (Believe me, you really do.) Either way, you need another way to get the Jacobian $J$. It's well known that you can get a decent approximation to $J$ using finite differences as follows.

Let's look at one entry: $J_{ij} = \frac{\partial F_i}{\partial x_j}$, and approximate it using the limit definition. The vector $\e nj$ is the unit vector in $\R^n$ with a one in the $j^\text{th}$ element and zeros elsewhere, and $F_i(x)$ is the $i^\text{th}$ component of $\F(x)$.

$$\frac{\partial F_i}{\partial x_j}(x) = \lim_{\delta \rightarrow 0} \frac{F_i(x + \delta \e n j) - F_i(x)}{\delta}$$

In a computer, we can't evaluate the limit, so we just choose some “small” value for $\delta$ and implement the formula in code, building a column at a time:

  for j = 1:n
J[:,j] = (F(x + delta * e(m,j)) - F(x))/delta

This can be made more efficient, and more accurate (central differences, Ridders's method etc), but the point I mainly want to explore here is the choice of delta, and argue that really, you should set things up so that 1e-5 works…

Imagine doing something else: you might set delta = 1e-5 * abs(x[j]) for each j. But then if the user evaluates the Jacobian at the origin $x=0$, it will divide by zero. So such code ends up doing something like delta = max(1e-5 * abs(x[j]), 1e-7). No longer scale invariant, and much harder to reason about.

So. In conclusion. Think about the parameters, and try to keep them in sensible units.

1)
Obviously if *all* columns of the Jacobian are zero, you're already at the optimum.
2)
It's quite likely we would need beta, as one reason phase of the moon might be contributing is something to do with werewolf activity, so it will be sharply peaked around the full moon, but we don't know how sharp the peak is. Fitting beta will let us determine that.

## Discussion

, 2020/08/20 19:37

Ou Acheter Cialis 20 Racydayted https://cialiser.com/ - buy cialis and viagra online quetillils Dynamogen Buy Online Uk Zemeenzype <a href=https://cialiser.com/#>cialis online</a> Vemiff Amoxicillin Causes Hearing

, 2020/11/19 15:58

, 2020/11/19 23:21

Drug information leaflet. What side effects can this medication cause? <a href=“https://lyrica2021.top”>online lyrica</a> Some news about medicament. Get here.

, 2020/11/20 03:25

Medicines information. What side effects? <a href=“https://dota2.su”>cephalexin without a prescription</a> Actual information about medicine. Read information now.

, 2020/11/20 05:26

Medicament information for patients. Cautions. <a href=“https://lipitor2021.top”>lipitor pills</a> Best news about meds. Get information here.

, 2020/11/20 10:27

Pills information. What side effects can this medication cause? <a href=“https://fcspam.ru”>cheap seroquel prices</a> All news about drug. Get here.

, 2020/11/20 13:27

, 2020/11/24 13:41

Medication information. What side effects can this medication cause? <a href=“https://medicals.top”>https://medicals.top</a> Actual what you want to know about medication. Read information now.

, 2020/11/25 17:49

Medicines information. Effects of Drug Abuse. <a href=“prilosec4u.top”>where buy prilosec</a> in the USA Actual what you want to know about drugs. Read information now.

, 2020/11/25 21:37

Medicines information. Short-Term Effects. <a href=“https://fcspam.ru”>fcspam.ru</a> seroquel without ption In the USA Everything news about drugs. Read now.

, 2020/11/28 09:45

Medication information sheet. Cautions. <a href=“https://lexapro2020.live”>cost of cheap lexapro pills</a> Everything about medicament. Read information here.

, 2020/11/28 10:38

Drugs information sheet. Short-Term Effects. <a href=“effexor4u.top”>effexor buy</a> in US All what you want to know about medicine. Get here.

, 2020/11/28 12:24

Medicine information leaflet. Generic Name. <a href=“http://jelejewelry.ru”>jelejewelry.ru</a> where to buy prednisone without a prescription In US All about medicines. Read information here.

, 2020/11/28 12:51

Medicament prescribing information. Effects of Drug Abuse. <a href=“https://prednisone.site”>https://prednisone.site</a> Best information about medicine. Read information now.

, 2020/11/28 14:40

Medicament information leaflet. What side effects can this medication cause? <a href=“https://rostovdriver.ru/accupril/where-can-i-get-accupril-online.html”>where can i buy cheap accupril online</a> All information about medicines. Read information now.

, 2020/11/28 15:45

, 2020/11/28 18:04

Medicine information leaflet. Drug Class. <a href=“https://rostovdriver.ru/accupril/where-can-i-get-accupril-online.html”>how can i get cheap accupril price</a> All news about medicament. Get information now.

, 2020/11/28 19:12

Medicament prescribing information. Cautions. <a href=“https://intermedia24.ru”>where can i get cytotec tablets</a> Some trends of medicament. Get here.

, 2020/11/28 20:21

Drugs information for patients. What side effects? <a href=“https://prednisone2021.top”>online prednisone</a> Actual news about medicine. Read information now.

, 2020/11/28 21:28

Pills prescribing information. Effects of Drug Abuse. effexor get https://effexor4u.top in the USA Actual what you want to know about medication. Read information here.

, 2020/11/28 22:31

Medication information sheet. What side effects? <a href=“https://lexapro2020.top”>where to buy cheap lexapro without dr prescription</a> Some about medication. Get information now.

, 2020/11/28 23:41

Medicine information sheet. What side effects can this medication cause? <a href=“https://prednisone.site”>https://prednisone.site</a> Some what you want to know about medication. Get information here.

, 2020/11/29 07:53

, 2020/11/29 10:34

lexapro gain online 0duj <a href=“https://lexapro2020.top/”>buy lexapro without prescription</a>

, 2020/11/29 11:49