# Technology Musings

### General / Finding Derivatives without Limits or Differentials

JB

I always enjoy finding new ways of doing things.  I was reading Henle/Kleinberg's Infinitesimal Calculus, which is a great book.  It has, throughout the book, a main text and a side-text.  The side-text is infinitely more interesting than the main text.

Anyway, on pages 65-66, it gives a method for finding the slope of y = x^2 at a specific point, let's say at (2, 4).  The way you do it is kind of strange, but it seems to work.  You start by defining a line that intersects the curve:

y = mx + b

Now, we don't know what m and b are, but we *do* know what x and y are.  Therefore,

4 = 2m + b

We don't really care about b in the long run, so we want to find an equation for b to remove it from the mix:

b = 4 - 2m

Now our equation for the line becomes:

y = mx + 4 - 2m

Interestingly, we already have an expression for y, namely y = x^2, so this becomes:

x^2 = mx + 4 - 2m

We can rearrange this into quadratic form with:

x^2 - mx + (2m - 4) = 0

Using the quadratic formula, we can do:

x = (m +- sqrt(m^2 - 4(2m - 4))) / 2

x = (m +- sqrt(m^2 - 8m + 16)) / 2

x = (m +- sqrt((m - 4)^2)) / 2

x = (m +- (m - 4)) / 2

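Now, we want to take the "+" side of the "+-".  The "-" side just gives x = 2, which is the point we already know, and "m" drops out of it entirely.  The "+" side gives the other place where the line crosses the curve, and it still has "m" in it.  Therefore, this becomes: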

x = (2m - 4) / 2

x = m - 2

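So, at the point we are looking at, what is the value of x?  For the line to just touch the curve there instead of cutting through it, this x has to be the x of our point.  At (2, 4), x is 2.  Therefore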

2 = m - 2

m = 4

Therefore, the slope at (2, 4) is 4.

Interestingly, we can also generalize this to get the full derivative for y = x^2.  To do this, we will introduce the variables p and q to be the specific x and y values at a specific point.  Therefore, the slope at (p, q) for y = x^2 is:

y = mx + b, therefore q = pm + b, therefore b = q - pm

q = p^2, therefore b = p^2 - pm

y = mx + p^2 - pm

y = x^2, therefore:

x^2 = mx + p^2 - pm

x^2 - mx + (pm - p^2) = 0

x = (m +- sqrt(m^2 - 4(pm - p^2)))/2

x = (m +- sqrt(m^2 - 4pm + 4p^2))/2

x = (m +- sqrt((m - 2p)^2))/2

x = (m +- (m - 2p))/2

x = (2m - 2p) / 2

x = m - p

m = x + p

Since p = x, this gives

m = x + x = 2x, which is the derivative of x^2
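As a rough numerical check of the steps above, here is a small Ruby sketch.  The method name `intersections` and the sample slopes are my own illustration, not something from the book.  It solves x^2 - mx + (pm - p^2) = 0 for a given slope and shows the two crossing points merging into one when m = 2p.

```ruby
# Where does the line of the given slope through (px, px^2) meet y = x^2?
# Solves x^2 - slope*x + (px*slope - px^2) = 0 with the quadratic formula.
def intersections(slope, px)
  discriminant = slope**2 - 4 * (px * slope - px**2) # same as (slope - 2*px)^2
  root = Math.sqrt(discriminant)
  [(slope - root) / 2.0, (slope + root) / 2.0]
end

p intersections(5, 2) # slope 5 cuts the curve at x = 2 and again at x = 3
p intersections(4, 2) # slope 4 meets the curve only at x = 2: the tangent
```

With integer inputs the discriminant is a perfect square, so the square root is exact.  With arbitrary floats, rounding could push it slightly below zero, and a clamp would be needed before taking the root.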

I don't know how many classes of functions you can do this successfully for, but I imagine it should cover quadratics at least.

### General / Ratios of Infinity

JB

A lot of people do a lot of thinking about whether or not infinities exist, and wonder how real numbers can add up to infinity.

Sometimes I wonder if that sort of thought might be the inverse of reality - perhaps the infinities are the foundational realities, and all finite numbers are merely infinities in ratio.

### General / A Meaning for d^2y/d^2x?

JB

I have recently been randomly curious about random calculus-y things.  Anyway, the notation for the second derivative of a function is d^2y/dx^2.  This is the second differential of y divided by the first differential of x squared.

However, there is, technically, also a d^2x, though it doesn't get much attention.  And, d^2y and d^2x can be put into ratio with each other, but I don't really know what it means.  But it is an interesting operation nonetheless.

So, the derivative of an equation is dy/dx and the derivative of its inverse is dx/dy.

The second derivative of an equation is d^2y/dx^2 and the second derivative of its inverse is d^2x/dy^2.

Therefore, to get the d^2y/d^2x you just do:

(2nd Derivative of y wrt x / 2nd derivative of x wrt y) * (First derivative of the inverse)^2

Or:

(d^2y/dx^2) / (d^2x/dy^2) * (dx/dy)^2

I will post an example later when I have a good one.
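Until then, one simple case to try is y = x^2 with x > 0, whose inverse is x = sqrt(y).  The Ruby sketch below just plugs the three known derivatives into the expression above, taking that expression at face value.  The method name is made up for illustration.

```ruby
# Evaluate (d^2y/dx^2) / (d^2x/dy^2) * (dx/dy)^2 for y = x^2, x > 0.
def d2y_over_d2x(x)
  y = x**2
  d2y_dx2 = 2.0                       # second derivative of x^2 wrt x
  dx_dy   = 1.0 / (2 * Math.sqrt(y))  # first derivative of sqrt(y) wrt y
  d2x_dy2 = -1.0 / (4 * y**1.5)       # second derivative of sqrt(y) wrt y
  (d2y_dx2 / d2x_dy2) * dx_dy**2
end

puts d2y_over_d2x(1.0) # -2.0
puts d2y_over_d2x(3.0) # approximately -6
```

Working it by hand gives 2 / (-1/(4x^3)) * (1/(2x))^2 = -2x for this function, which matches the printed values.  What that quantity means is still an open question.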

### General / The derivative of u^v

JB

I don't know why this isn't listed as a standard rule.  The differential of the exponent function, u^v (u raised to the v power), is pretty basic, and you can use it to formulate the other exponent rules for differentiation. However, for some reason it seems to be left off of most rules for differentiation.

The basic rule is this:

`d(u^v) = v*u^(v-1)*du + u^v*ln(u)*dv`

This can be clearly seen to be a more general form of u^n, because if n is a constant, this becomes:

`d(u^n) = n*u^(n - 1)*du + u^n*ln(u)*0`

And that zero drops it to the rule we all know and love:

`d(u^n) = n*u^(n - 1)*du`

This can be derived as follows:

`z = u^v`

`ln(z) = ln(u^v)`

Using log rules, we get:

`ln(z) = v*ln(u)`

Now take the differential of both sides:

`d(ln(z)) = d(v*ln(u))`

Differentiating both sides, we get:

`dz/z = v*du/u + ln(u) * dv`

Then we multiply by z:

`dz = z*v*du/u + z * ln(u) * dv`

Substituting in z = u^v:

`dz = u^v * v * du / u + u^v * ln(u) * dv`

Now, u^v/u simplifies to u^(v-1), giving us:

`dz = v*u^(v-1)*du + u^v*ln(u)*dv`

Since z = u^v, dz = d(u^v), so

`d(u^v) = v*u^(v-1)*du + u^v*ln(u)*dv`
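One way to gain confidence in the rule is to compare it against a finite-difference derivative.  The Ruby sketch below does that for one arbitrary pair of functions, u(t) = t^2 + 1 and v(t) = sin(t).  Those functions and all method names are my own choices for illustration.

```ruby
# Illustrative choice (not from the post): u(t) = t^2 + 1, v(t) = sin(t).
def u(t);  t**2 + 1.0;  end
def du(t); 2.0 * t;     end  # du/dt
def v(t);  Math.sin(t); end
def dv(t); Math.cos(t); end  # dv/dt

# The rule divided through by dt:
#   d(u^v)/dt = v*u^(v-1)*du/dt + u^v*ln(u)*dv/dt
def rule(t)
  v(t) * u(t)**(v(t) - 1) * du(t) + u(t)**v(t) * Math.log(u(t)) * dv(t)
end

# Central-difference approximation of the same derivative.
def numeric(t, h = 1e-6)
  (u(t + h)**v(t + h) - u(t - h)**v(t - h)) / (2 * h)
end

t = 1.3
puts rule(t)
puts numeric(t)
```

The two printed values should agree to several decimal places.  Note that u has to stay positive for ln(u) to be defined, which is why 1 is added to t^2.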

### General / Explanation of Leibniz Notation of the Second Derivative

JB

The notation for the second (and higher) derivatives in Leibniz notation has always troubled me.  The second derivative is usually notated as d^2y/dx^2.  And I always wondered, why is the 2 in relation to the d on the top, but in relation to the whole term on the bottom?

This puzzled me for a while, and I looked through at least 10 Calculus textbooks to find the answer, all to no avail.  Finally, I put pen to paper and figured it out.  The answer was straightforward, and I am sure that whoever invented the 2nd derivative Leibniz notation knew why they did it that way; it's just that every Calculus book since then seems to have forgotten.

Anyway, the usual notation for the derivative operation is d/dx.  I eventually came to realize that this is not a single operation, but TWO operations in one.  The first is to take the differential (not derivative) of the equation.  By that, I mean, let's say you have the equation y = x^2.  The differential of that equation is dy = 2xdx.  Doing a differential instead of a derivative is powerful, because it allows a more natural approach to both implicit differentiation and multivariable differentiation.

Now, the derivative is simply the differential divided by dx.  So, if our differential is dy = 2xdx, then when we divide by dx we get our normal notation dy/dx = 2x.

Now, let's leave off dividing by dx, and just look at the differential.  What happens if we take a second differential?

In this case, we have to treat dy as a separate variable.  So, if the differential of y is dy, what is the differential of dy?  This is where the d^2y comes in.  The 2 is by the "d" not because it is "d squared", but it is "d applied twice".  So, if we take our initial differential dy = 2xdx and take another differential, what do we have?

By the product rule, we get:

d^2y = 2((x)(d^2x) + (dx)(dx))

Now, since we want the second derivative, we divide both sides by dx twice (once for the first derivative, and once for the second), which is the same as dividing by dx^2.

That gives us:

d^2y / dx^2 = 2((x)(d^2x) + (dx)(dx)) / dx^2

The (dx)(dx) simplifies to dx^2, and we can separate this out at the plus sign to give us:

d^2y / dx^2 = 2(x)(d^2x)/dx^2 + 2(dx^2)/dx^2

This further simplifies to:

d^2y / dx^2 = 2(x)(d^2x)/dx^2 + 2

Now, we know that the second derivative is 2, so that means that 2(x)(d^2x)/dx^2 must equal zero.  But why?

Well, let's rewrite this term a little:

2x * (d^2x)/(dx^2)

d^2x/dx^2 is simply the second derivative of x with respect to itself!

To understand why this must be zero, first imagine the first derivative of x with respect to itself: dx/dx

This must always be 1!  Another way of stating this is that "for every change in x, the corresponding change in x is equivalent", which seems obvious.

Now, 1 is a constant, so the second derivative must be zero!  This zeroes out the whole term, leaving:

d^2y/dx^2 = 2

Thus, we have the notation of the second derivative from first principles.

Another way to look at this is to use the quotient rule.

If we start with dy/dx, then take the differential, what do we get?

((dx)(d^2y) - (dy)(d^2x)) / dx^2

If we separate, we get:

(dx)(d^2y) / dx^2 - (dy)(d^2x)/dx^2

The first term simplifies to d^2y / dx.  The second term is our second derivative of x with respect to itself again, so it becomes (dy)*0, giving us:

d^2y/dx - (dy)*0

Which gives us just

d^2y/dx

Now, the second step of the derivative is to divide by dx.  Doing so gives us:

d^2y/dx^2

Which is our second derivative!
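Finite differences give a feel for why the d^2x term vanishes.  The Ruby sketch below is only an analogy, under the assumption that x advances in equal steps the way an independent variable does.  It uses y = x^2, and the method name is made up for illustration.

```ruby
# Finite-difference analogue of the second differential for y = x^2.
# Assumption: x advances in equal steps h.
def second_differences(x0, h)
  xs = [x0, x0 + h, x0 + 2 * h]
  ys = xs.map { |x| x**2 }

  dx  = xs[1] - xs[0]                      # first difference of x
  d2x = (xs[2] - xs[1]) - dx               # second difference of x
  d2y = (ys[2] - ys[1]) - (ys[1] - ys[0])  # second difference of y

  { d2x: d2x, ratio: d2y / dx**2 }
end

# d2x comes out as 0.0 and the ratio d2y / dx^2 as 2.0
p second_differences(3.0, 0.5)
```

Because the steps in x are all the same size, the difference of the differences of x is zero, while the same operation on y leaves 2 * h^2.  Dividing by h^2 gives the 2 from the derivation above.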

### General / A General Solution for Line Integrals

JB

Update: I found that this is a standard formula, but it is referred to as an "arc length" formula, not a line integral.  I knew it had to be somewhere, I was just searching the wrong terms.

I was playing around last night, and I came up with what appears to be a general solution for line integrals.  I have looked around and I have not found this equation anywhere, although I am certain it must be well-known somewhere.  In any case, I will give it here, because it is definitely not well-known on the Internet.

This is an equation for determining the line integral of y = f(x), where f(x) is a true function of x (i.e. it passes the vertical line test), and is differentiable on the portion of the line where you want to find the length of the line segment.  Most line integrals work by adding in an auxiliary parameter, t, to the equation.  This one operates directly on the function itself.

The short version:

To get a line integral from x=a to x=b on y=f(x), we can use the following formula:

L = integral of sqrt(1 + (dy/dx)^2) dx

Evaluate this from x=a to x=b and you will have the length of the line.  Note, the software I was using gives this as a partial derivative on the inside.  I don't know enough to know if it is only limited to partials or not.

So, if you know the derivative, you can figure out (in theory) the line integral.  Whether or not this actually helps you find a solution to the integral is another story.

Now, how does one arrive at that?

Well, first, remember, that an integral is an infinite sum of infinitesimals.  A curve is simply an infinite sum of tiny, infinitesimal lines.  So, all we need to do is find the length of each line segment and add it up.

How big are our line segments?

Well, if y=f(x), then the line segment is going to be the line between the points x, y and x + dx, y + dy.  The distance formula states that the line length is sqrt( (x1 - x0)^2 + (y1 - y0)^2).  The difference between x + dx and x is simply dx, and the difference between y + dy and y is simply dy.  So, each of our infinitesimal line segments (dL) will be:

dL = sqrt(dx^2 + dy^2)

If we square both sides, we get the following:

dL^2 = dx^2 + dy^2

dL^2 - dx^2 = dy^2

Now we can divide both sides by dx^2 and get:

dL^2/dx^2 - 1 = dy^2/dx^2

dL^2/dx^2 = 1 + dy^2/dx^2

dL^2/dx^2 = 1 + (dy/dx)^2

Now we can square root both sides:

dL/dx = sqrt(1 + (dy/dx)^2)

dL = sqrt(1 + (dy/dx)^2) dx

And this is the final form of our equation.  I tested it out with a few simple integrals (a line and the top half of a circle) and it seems to work as expected.  And, as I said, if you have a line that loops back around, you still need to use the parameterized methods.  But this is an easier way to set up line integrals where f(x) is a true function of x.
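A quick numerical version of that kind of test might look like the Ruby sketch below.  It approximates the integral of sqrt(1 + (dy/dx)^2) dx with a midpoint rule.  The method name `arc_length`, the step count, and the two test curves are my own choices for illustration.

```ruby
# Approximate the integral of sqrt(1 + (dy/dx)^2) dx from a to b
# with the midpoint rule. dydx is a callable returning f'(x).
def arc_length(dydx, a, b, steps = 100_000)
  h = (b - a).to_f / steps
  (0...steps).sum do |i|
    x = a + (i + 0.5) * h
    Math.sqrt(1 + dydx.call(x)**2) * h
  end
end

# Straight line y = 2x from x = 0 to x = 3: the length is 3*sqrt(5).
line = arc_length(->(_x) { 2.0 }, 0, 3)

# Top of the unit circle, y = sqrt(1 - x^2), from x = -0.5 to x = 0.5.
# dy/dx = -x / sqrt(1 - x^2); the arc spans 60 degrees, so the length is pi/3.
arc = arc_length(->(x) { -x / Math.sqrt(1 - x**2) }, -0.5, 0.5)

puts line, 3 * Math.sqrt(5)
puts arc, Math::PI / 3
```

The circle test stops short of x = -1 and x = 1 on purpose.  The slope is vertical there, so dy/dx blows up and a plain midpoint rule converges slowly at the ends of the full half circle.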

### General / Why You Shouldn't Put Validations on Your Model

JB

This post is about model validators in Ruby on Rails, although it probably applies to validators in most similar MVC systems that allow declarative model validation.

One of the biggest disasters, in my opinion, has been the trend for developers to put more and more validation in their models.  Putting validation in the model always *sounds* like a good idea.  "We want our objects to look like X, therefore, we will stick our validations in the place where we define the object."

However, this leads to many, many problems.

There are two problems, and the most general way of stating them is this:

1. life never works the way you expect it to, but validating on the model assumes that it will.
2. while it is cleaner to validate on the model for a given configuration, when that validation needs to change, you don't always have enough information to understand the purpose of the validation in the first place (i.e., if a new developer is making the change)

Example of both -

Let's say that we require all users to enter their first name, last name, birthdate, and email.  Okay, so we'll put validators on the model to make sure that this happens.

Let's say that six months go by, and your company signed an agreement with XYZ corporation to load in new users.  However, XYZ corp didn't require that their users enter in their birthdate.  You don't know that, and in the bit of data you looked at, the birthdates all appear to be there.  So, you do a load through the database, and everything loads in cleanly, and you think that everything is good.

In fact, *Everyone* can view their records.

However, the next week, people start getting errors when they try to update their profile.  Why?  Because the people who don't have a birthdate entered are getting validation errors when they try to update something else.

So here is problem #1 - if an existing record doesn't match the validation, nothing can be saved, and this causes headaches because you are getting errors in places where they were unexpected.  This doesn't just happen with dataloads.  It happens every time you add a validator as well.  So, every time you change policy, even if it seems benign, you have the potential of messing up your entire program, even if all of your tests succeed (because your tests will never have had the bad data in them, since the test data was never created by the old version of the code!).

So, we go back and recode, and remove the validator from the model.  Great, but let's say this is a new programmer.  Now the programmer has to figure out where to put the validation in.  Presumably, we still want new users to enter in their birthdate, even if we allow non-birthdate users that we load from outside.  However, now, because we aren't validating on the model, we have to move it somewhere else.  But where?  Since the validator just floated out there declaratively, there is no direct link between the validator and *where* it was to be enforced.  Now the new programmer has to go through and recode and retest all of the situations to figure out the proper place(s) for the validator.  This is especially problematic if you have more than one path for entrance on your system.  It's not so bad if it is a system you wrote yourself, but when having to maintain someone else's code, this is a problem.

Therefore, the policy I follow is this - only use model validation when a failure of the validator would lead to a loss of data *integrity*, not just cleanliness.  The phone number field should be validated somewhere else.  You should validate a uniqueness constraint only if uniqueness is actually required for the successful running of the code.  Otherwise, validate in the controller.  Now, you can make this easier by putting the code to validate in the model, but just don't hook it in to the global validation sequence.  So, go ahead and define a method called "is_proper_record?" or something, but don't hook it up to the validation system - call it directly from your controller when you need it.
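To make that concrete, here is a minimal plain-Ruby sketch of the idea.  It is not Rails code: `User` is a simple Struct standing in for a model, and `signup` and `update_email` are hypothetical stand-ins for controller actions.  The point is only that the check lives on the model but is called explicitly by the path that needs it.

```ruby
# Plain-Ruby sketch; User, signup and update_email are hypothetical stand-ins.
User = Struct.new(:first_name, :last_name, :birthdate, :email, keyword_init: true) do
  # Policy check kept on the model, but NOT wired into any save hook.
  def is_proper_record?
    [first_name, last_name, birthdate, email].none? { |f| f.nil? || f.to_s.strip.empty? }
  end
end

# Signup path: new users must satisfy the full policy.
def signup(attrs)
  user = User.new(**attrs)
  return [:rejected, user] unless user.is_proper_record?

  [:saved, user]
end

# Profile-update path: imported users without a birthdate can still save.
def update_email(user, new_email)
  user.email = new_email
  [:saved, user]
end

imported = User.new(first_name: "Ann", last_name: "Lee", email: "ann@example.com")
p signup({ first_name: "Bob", last_name: "Ray", email: "bob@example.com" }).first # :rejected
p update_email(imported, "ann.lee@example.com").first                             # :saved
```

A new signup without a birthdate is turned away, while the imported record with no birthdate can still be updated.  Anyone reading `signup` can see exactly where the policy is enforced.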

NOTE - if I don't get around to blogging about it, there are similar problems with State Machine, and a lot of other declarative-enforcement procedures in the model.  Basically, they are elegant ways of coding, but *terrible* when you need to modify them, especially if the person modifying the code wasn't the one who built it.  Untangling the interaction between declarative and manual processes leads to both delays (as the new developer tries to figure out the code) and bugs (when the new developer doesn't figure out the code perfectly the first time).  When your processes are coded *as processes* and not declarations, it is more visible to future programmers, and therefore more maintainable.

### General / New Book Integrating Engineering, Philosophy, and Theology

JB

I thought some of you, my faithful blog readers, might be interested to know about a project I just completed with some other people - a new book about integrating engineering, philosophy, and theology.  It is titled Engineering and the Ultimate: An Interdisciplinary Investigation of Order and Design in Nature and Craft.

### General / Knowing When to Start Over

JB

A friend of mine shared the following quip on FB:

Quinn's design tip o' the day: You can't iterate your way out of local maxima. In other words, no amount of horse breeding will produce a Model T.

Sooner or later, you just have to throw it all away and start over.

### General / Today's Thoughts on Algorithmic Information Theory

JB

I know this won't make a lot of sense to most people, but I thought I needed to store these thoughts in a more permanent place before I forget them.  I've been reading a paper today on Algorithmic Information Theory, and it made me think these thoughts (most of which are probably ill-founded):

1. Is the Chaitin halting probability really a probability?  In other words, if I have the first 5 bits of omega, does this mean that a random sampling of all programs will really match this value to some extent?
2. To what extent are self-delimiting programs an intrinsic part of AIT, vs just a cheap trick to get the math right?  It makes sense simply because you need the program length.  But on the other hand, if omega is a probability, it seems like having the program length pre-coded would do a number on its interpretation as a probability.
3. If log2(2^w) is w, is perhaps log2(w) = c?  Is c the complexity number I'm looking for for my axioms?
4. Is it possible to express the idea of a formal relationship to be the relationship between the algorithmic specification and its logical depth?  I.e. we should expect a material cause when logical depth is low, but a formal one when logical depth is high?  Could this be expressed as a ratio?
5. Algorithmic complexity on partial specifications.  The word "red", as implemented in 12 pt font or written in the clouds.  Perhaps the partial specification is something that can be detected non-algorithmically, via an Oracle machine?  In other words, instead of "give me the Xth object that matches specification X", you say, "give me *an* object that matches specification X".  It does not require an iteration over the whole of infinity, but a potentially large part of it very quickly.  Or, perhaps, it drops it a level of Cantorian infinity.
6. Recognizers vs generators = P/NP problem = logical depth vs algorithmic complexity
7. What is the relationship between the constant given in AIT vs expressibility of language?  In other words, if C is small, then it is easier to express random strings, at the penalty of having no easily expressible strings.  A larger C will permit more compressibility.  This might also be usable, at least in theory, to determine the total number of "special" strings available.
8. How can algorithmic information theory be expanded for programs that take arguments? (note - number of arguments shouldn't matter, only whether one of them is itself a program, perhaps? or if the result is a program?  Maybe what the Wolfram class of the argument is?)
9. Might semantics and apobetics be higher Turing degrees?
10. Might ethical norms be an expression of compressibility across Turing degrees?
11. What about bad plans?  Is it possible to state, even in abstract, non-determinable terms, what a bad plan is?