## Saturday, October 13, 2012

### Digital Roots Revisited: When Not To Get Carried Away With Them

In this earlier post, I talked about three properties of digital roots.  In simple mathematical terms, if the digital root of a number a with respect to base n is expressed as Dn(a), I provided a proof of the following properties of digital roots:
• Dn(a + b) = Dn(Dn(a) + Dn(b))
• Dn(a - b) = Dn(Dn(a) - Dn(b))
• Dn(a * b) = Dn(Dn(a) * Dn(b))
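These three properties are easy to check numerically.  Here is a quick Python sketch (the function name digital_root and the closed form 1 + (a - 1) % n are my own choices, not something from the earlier post):

```python
import random

def digital_root(a, n=9):
    """Digital root of a with respect to base n.  For positive a, repeatedly
    summing digits until one digit remains is equivalent to 1 + (a - 1) % n."""
    return 1 + (a - 1) % n

# Spot-check the three properties on random pairs of positive integers.
for _ in range(1000):
    a = random.randint(1, 10**9)
    b = random.randint(1, 10**9)
    assert digital_root(a + b) == digital_root(digital_root(a) + digital_root(b))
    assert digital_root(a * b) == digital_root(digital_root(a) * digital_root(b))
    # For subtraction, test larger minus smaller; Python's % keeps the
    # closed form consistent even when the inner difference is zero or negative.
    if a != b:
        hi, lo = max(a, b), min(a, b)
        assert digital_root(hi - lo) == digital_root(digital_root(hi) - digital_root(lo))

print("all three properties verified")
```

The closed form works because the digital root of a positive number with respect to n depends only on the number's remainder modulo n.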
While working with my children on some brain-teasers, I thought I had come across another property of digital roots, one that seemed quite interesting.  To set up the background, the problem I posed to my children was as follows:

Find the digital root with respect to 9 (i.e. the traditional definition of digital roots) of 1777^1777 (where ^ denotes the operation of raising to a power).  I was not sure my children knew the properties of digital roots, but not only did they know the three above, they used another one which I was not sure was even true.

The way I would solve the problem I posed is this:  I would calculate D9(1777) as 4.  I know that the digital root of 1777*1777 is 7 because that is the digital root of 4*4, which is 16.  I also know that the digital root of 1777^3 would be 1 (the digital root of both 7*4 and 4*4*4 is 1).  The digital root of 1777^4 would be 4 (the digital roots of 1*4 and 4*4*4*4 are both 4).  So, the pattern of digital roots as we raise 1777 to different powers is 4, 7, 1, 4, ... .  The pattern has a period of 3, so I would divide 1777 (the exponent) by 3 to get a remainder of 1.  This would then point me to the first number in my series of digital roots (4) as the answer to my problem.
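The cycle-finding method above can be sketched in a few lines of Python (this is my own rendering of the manual procedure, with function names of my own choosing):

```python
def digital_root(a, n=9):
    # For positive a, the digital root w.r.t. n is 1 + (a - 1) % n.
    return 1 + (a - 1) % n

def power_digital_root(a, b, n=9):
    """Digital root of a**b, found by detecting the cycle of digital
    roots of successive powers of digital_root(a)."""
    d = digital_root(a, n)
    cycle = [d]          # cycle[0] is the digital root of a**1
    cur = d
    while True:
        cur = digital_root(cur * d, n)
        if cur == cycle[0]:
            break
        cycle.append(cur)
    # b's position within the cycle picks out the answer.
    return cycle[(b - 1) % len(cycle)]

print(power_digital_root(1777, 1777))  # prints 4, matching the answer above
```

For 1777 the detected cycle is [4, 7, 1], with period 3, exactly as in the manual calculation.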

My children, though, got the better of me.  They arrived at this answer well before I could.  How did they do it?  They took the digital root of both 1777's, and decided to find the digital root of 4^4 as the solution to my problem.  That gave them D9(256), which is 4.

So, is it true that D(a^b) = D(D(a)^D(b))?  Did I miss an important property of digital roots that can make some problems simpler than the method I was using for them all along?  I decided to verify if I could prove the correctness of what my children did, and this post is the result of that effort.

I started working out the proof using n as the base of my digital root for the sake of generality.  If that works, then we can be assured that it will work for the traditional definition of digital root, which uses 9 as the base.  But the general nature of the proof will give us confidence that it can be used with other bases also.

Let a = nw + x
Let b = ny + z

So, Dn(a) = x, and Dn(b) = z.

Let a^b = c.  Thus c = (nw+x)^(ny+z).

Now, we know that p^(q+r) = p^q * p^r.

So, we can write c as (nw+x)^ny * (nw+x)^z.  This is a product of two terms.  We can use the application of digital roots to multiplication to say that Dn(c) = Dn(Dn((nw+x)^ny)*Dn((nw+x)^z)).  Don't let the complexity of the expression throw you off.  It is basically a rewriting of D(a * b) = D(D(a) * D(b)), with a being equal to (nw+x)^ny and b being equal to (nw+x)^z.

Now, let us look at the number (nw+x)^ny.  This is basically (nw+x)*(nw+x)*(nw+x)*..., ny times.  When you expand out this product, you will find that all the terms in the answer except x^ny contain nw in them, and are therefore divisible by n without a remainder.  So, they don't contribute to the digital root at all.  So, Dn((nw+x)^ny) = Dn(x^ny).  By a similar logic, we can easily see that Dn((nw+x)^z) = Dn(x^z).

So, Dn(c) = Dn(Dn(x^ny) * Dn(x^z)).

Now, for my proof to work, I need to get Dn(x^ny) to be equal to 1 so that Dn(c) = Dn(Dn(x^z)).  I actually got stuck at this point.  And it was not for lack of trying.  I tried everything possible to prove that Dn(x^ny) = 1.  I was so convinced that the property was true, and that only my ability to prove it was lacking, that I wasted a lot of time mulling over the problem.  Basically, given the generality of x, n and y, this would mean that when you raise any number to a multiple of the digital root base, the digital root of the result should be equal to 1.

This is certainly true for some numbers like 4 (the digital roots with respect to 9 of its powers cycle through 1 at exponents that are multiples of 3).  But it is not true for numbers such as 2 (2^9 = 512, whose digital root is 8), 5 (5^9 = 1,953,125, whose digital root is 8) or 8 (8^9 = 134,217,728, whose digital root is 8).

I then used this knowledge to come up with a simple problem which actually disproves the property.  If you try to use this for 2^18 using 9 as a base for your digital roots, you will immediately see that the property does not work.  The digital root of 2 is 2 and that of 18 is 9.  So, the digital root of 2^18 should be the same as the digital root of 2^9 if this property were true.  But, unfortunately, the digital root of 2^9 is 8 and that of 2^18 is 1.

In fact, it turns out that when you use 9 as your base for digital roots, the property is true only when the digital root of the number being raised to powers (a in our case) is either 1, 4, 7 or 9.  It does not work for any other numbers.  You can verify that it does not work by calculating the digital roots of 2^10, 3^10, 5^10, 6^10 and 8^10 (if this property were true, the digital roots of these numbers should be 2, 3, 5, 6 and 8 respectively, and you can easily verify that they are not).
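This claim can be checked exhaustively with a short script (again a Python sketch, with names of my own invention):

```python
def digital_root(a, n=9):
    return 1 + (a - 1) % n

# For which digital roots d of the base number a does the shortcut
# D(a^b) = D(D(a)^D(b)) hold for every exponent b?
good = []
for d in range(1, 10):
    if all(digital_root(d**b) == digital_root(d**digital_root(b))
           for b in range(1, 200)):
        good.append(d)

print(good)  # prints [1, 4, 7, 9]
```

Testing exponents up to 199 is more than enough to expose the failures; for instance the shortcut already breaks at 2^10.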

So, my children had gotten lucky because I asked them for 1777^1777, and 1777 has a digital root of 4.  And 4 just happens to be one of the few numbers for which the property actually works (at least as far as digital roots with respect to 9 go; it does not work even for 4 when you work with digital roots with respect to other bases, such as 7, for instance).

So, when you are asked to work out the digital root of a^b, where a and b are large numbers, you have to work it out the traditional way by figuring out the series of digital roots of powers of the digital root of a, find their period, divide b by that period to find the remainder, and then figure out what digital root that corresponds to.  You cannot short-circuit the process by taking the digital roots of a and b, and using them in an exponentiation operation to come up with the answer.  Yes, it will work if the digital root of a is 4 and you are working with a base of 9, but won't work with a lot of other numbers!

What is the point of all this then?  The point is that, sometimes, things work because they are special cases, not because they are true in general.  When your first encounter with something is that special case, you may be tempted to conclude that it works all the time and draw a generalization from it.  The point of this post is that it pays to be cautious.  If possible, work the math out and convince yourself with a proof that the case you encountered is not special in some way.  Don't get carried away with a generalization that is not justified.  Good luck!

## Thursday, June 28, 2012

### Microsoft Access Tips & Tricks: Fun With Crosstab Queries

In this earlier post on crosstab queries, I explained the basic syntax of crosstab queries in Access.  I also provided several examples to illustrate how to use crosstab queries for the purpose it is commonly used for:  create summary tables of data on two dimensions at row and column intersections.  However, crosstab queries can also be used for some unconventional and downright fun purposes.  In this post, I am going to talk about a few such uses and provide some examples.

If you are interested, you can find my earlier posts on finding the median, the mode, the geometric and harmonic means, ranking every row in a query, selecting random rows out of a table, calculating running sums and averages, calculating running differences, creating histograms, calculating probability masses out of given data, calculating cumulative distributions out of given data, finding percentile scores, percentile values, calculating distinct counts, full outer joins, parameter queries, crosstab queries, working with system objects, listing table fields, finding unmatched rows, calculating statistics with grouping, job-candidate matching, job-candidate matching with skill levels, great circle distances, great circle initial headings, using Excel functions in Access, using the windows file-picker, using the Access date-picker, setting tab indexes quickly and correctly, pre-filling forms based on previous entries, highlighting form controls, conditional formatting, performing date manipulations, counting letters, words, sentences and paragraphs, calculating age, propercasing text entries, flattening a table (using SQL), flattening a table (using VBA), cascading comboboxes, parsing file names, opening files from within Access, and identifying runs of data.

Microsoft Access has a built-in query wizard for crosstab queries that makes creating such queries for conventional purposes quite easy.  But this wizard has some limitations.  In particular, the wizard will only work with tables that have 3 or more fields.  It needs you to select at least one of these fields as row headers, another as column headers and the third field as the data to summarize at the row/column intersections.  Technically, the wizard is supposed to work with queries as well as tables, but I have sometimes had trouble choosing a query rather than a table to work with when using the wizard.  So, I have gotten used to writing crosstab queries from scratch using the SQL view of the query design window.  I prefer the SQL view for all kinds of queries, but I find it particularly useful for writing unconventional queries that the wizards or design view do not handle well or at all.  All the queries in this post are actual SQL that you would enter into the SQL view of the query design window.  You will not have much luck creating them using the built-in wizards in Access.

Addition, Multiplication and Other Tables:  The first unconventional use of crosstab queries I am going to present is their use for the presentation of multiplication, addition and other "tables".  Everyone should be familiar with multiplication tables where a set of numbers is multiplied by another set of numbers, and the results are presented as successive rows for memorization by children (or anybody else, for that matter) who need to learn the results by heart.

Now, it is easy to create a non-crosstab query to produce multiplication tables in Access.  For instance, let us say you need to create multiplication tables for multiplicands from 1 through 11.  You could create a table called Numbers, which has a numerical field called Multiplicand, and fill it with the 11 numbers from 1 through 11.  You would then use the query below to create a conventional-looking multiplication table:

```
SELECT N1.Multiplicand,
" X ",
N2.Multiplicand,
" = ",
N1.Multiplicand*N2.Multiplicand AS Product
FROM Numbers AS N1, Numbers AS N2
```

This would produce output that looks as below:

```
1 X 1 = 1
1 X 2 = 2
```

And so on.  You can use an ORDER BY clause in the query above to change the order in which the multiplicands change if you have a preference for which multiplicand changes first in a multiplication table.

Now, note that I have used a cartesian join in the query above.  There is no ON clause in the join in that query:  I just put the tables I want to pull records from in the FROM clause of the query, separated by commas.  This causes every record in the first table to be joined with every record in the second table.  Be very careful when you do this because you can inadvertently end up creating millions or billions of records in the result set if you join two large tables using a cartesian join by mistake.

The other thing to note is that I joined the Numbers table with itself in the query.  I used aliases for each version of the table so that I could refer to their fields unambiguously.  This is called a self-join.  So, I created this basic multiplication table using a cartesian self-join.  You can use the same principle to create other kinds of tables which use the two numbers to come up with a result, not necessarily just the product.

If you use a crosstab query to create your multiplication table, you can get a much more compact representation of the results in the form of a square grid in which there are rows and columns that contain the multiplicands, and the product is in the cells of the grid.  The crosstab query that will allow you to do this is presented below:

```TRANSFORM N.Multiplicand*N1.Multiplicand AS Product
SELECT N.Multiplicand
FROM Numbers AS N, Numbers AS N1
GROUP BY N.Multiplicand
PIVOT N1.Multiplicand
```

The results of running the query would look like the picture on the left.  The important thing to note is the complete absence of any aggregate function like avg(), count(), etc., in the crosstab query above.  The standard syntax of a crosstab query requires that you use an aggregate function to fill in the grid created by the columns and rows.  But you can apparently flout the rules, and create non-standard crosstab queries like this in Access without Access complaining or producing a syntax error.  That is why this post is titled "Fun With Crosstab Queries"!

You will notice that in a commutative operation like multiplication, you get a symmetric matrix where the upper triangle reflects the lower triangle of the matrix.  You can get rid of the redundant elements of the matrix and make it an upper or lower triangular matrix by trying a variation of the query like below:

```TRANSFORM N.Multiplicand*N1.Multiplicand AS Product
SELECT N.Multiplicand
FROM Numbers AS N, Numbers AS N1
WHERE N.Multiplicand>=N1.Multiplicand
GROUP BY N.Multiplicand
PIVOT N1.Multiplicand
```

The results would then look like the figure on the right.

Obviously, if you are using this technique to create a table for a non-commutative operation (such as N1 raised to the power N2, or N1 divided by N2), then you should not use a WHERE clause to limit the results of the cartesian join.

Also, instead of using a self-join, you can use two tables of multiplicands that contain totally different sets of numbers.  So, if you wanted a multiplication table of the numbers 25 through 34 multiplied by 1 through 10, you would join a table that contains the numbers 25 through 34 with a table that contains the numbers 1 through 10.  Either that, or you can put the numbers 1 through 34 in one table, and use a cartesian self-join as before, but use the WHERE clause to limit the values of the two sets of multiplicands.

You can use this technique to create and keep handy a table that lists the decimal values of quarters, eighths, sixteenths, thirty-seconds, sixty-fourths, etc.  Part of such a table is illustrated on the left.

Create A Calendar:  Here is another fun application for crosstab queries that you can customize as you see fit.  For the purpose of this query, you can use one table that contains the numbers from 1 through 31 (this table will then be used for both dates and month numbers), or you can use a table with the 12 month names in a separate table in addition to the table with the dates from 1 through 31.

Now, as we all know, not all dates occur in all months.  In particular, February does not have a 30 or 31, and has a 29 only in leap years (roughly one year out of four).  Similarly, there is no 31st of April, June, September or November.  When we join the table of dates with the table of months (or use a self-join of the date and month numbers table with itself), how do we limit the results to just valid dates?  We can use the Access built-in function called IsDate() in the WHERE clause of the query to limit results to just valid dates.
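As an aside, for readers without Access handy, the same validity test can be sketched in Python (the function name is_valid_date is my own; Access's IsDate() performs the equivalent check on a date string):

```python
from datetime import date

def is_valid_date(month, day, year=2012):
    """Return True when month/day/year names a real calendar date,
    playing the role IsDate() plays in the query's WHERE clause."""
    try:
        date(year, month, day)
        return True
    except ValueError:
        return False

print(is_valid_date(2, 29))  # True: 2012 is a leap year
print(is_valid_date(2, 30))  # False: never a real date
print(is_valid_date(4, 31))  # False: no 31st of April
```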

We can then use the Format() function to derive the day of week name for each of the valid dates and use that to populate the grid.  If you use DatePart() instead, you will get day of week numbers rather than day of week names.  Once again, no aggregate functions at all!  If you want to use a single table with both dates and month numbers, use the query below:

```TRANSFORM iif(isdate(Months.DateNum & "/" & Dates.DateNum & "/2012"),
format(Months.DateNum & "/" & Dates.DateNum & "/2012","ddd"), "") AS DayOfWeek
SELECT Dates.DateNum
FROM DatesOfMonth AS Dates, DatesOfMonth AS Months
WHERE isdate(Months.DateNum & "/" & Dates.DateNum & "/2012")
GROUP BY Dates.DateNum
PIVOT Months.DateNum
```

If you want to create a separate table with month names in some format you prefer, then use the query below.  This creates the calendar as shown on the left.

```TRANSFORM iif(isdate([DateNum] & " " & [MonthName] & " 2012"),
format([DateNum] & " " & [MonthName] & " 2012","ddd"), "") AS DayOfWeek
SELECT DatesOfMonth.DateNum
FROM DatesOfMonth, MonthsOfYear
WHERE isdate([DateNum] & " " & [MonthName] & " 2012")
GROUP BY DatesOfMonth.DateNum
PIVOT MonthsOfYear.MonthName In
("January","February","March","April","May","June",
"July","August","September","October","November","December")
```

Notice that in the second query, I use the IN sub-clause in the PIVOT clause to order the months from January to December rather than alphabetically, which is the default sort order for the column headers in a crosstab query.  Also notice that I use an IIF() to limit the results to only valid dates.  For some reason, the grid produces a day of Tuesday for all invalid dates even though the WHERE clause of the query already limits the dates produced by the join to just valid ones.  It is one of those bugs/features of Access you just have to work around!

You can change the year (which is hard-coded to 2012 in both queries) to any year you want to see what a calendar for that year looks like.  Have fun, and check out a calendar for the year 5783 if you are curious!

Hope you found this post useful and fun.  SQL is very versatile, and this post explored some unconventional uses of crosstab queries.  Have you used a crosstab query to do something it was not designed to do?  Have you used any other SQL construct to achieve something that it was not meant to do?  Let me know in the comments section.  If you have any problems or concerns with the SQL in this post, please feel free to let me know by posting a comment.  Let me also know if you want me to address some other aspect of Microsoft Access in future posts.

## Tuesday, June 19, 2012

### Why The US Should Spend Its Way Out Of This Recession . . . And Why It Won't

So, the US Congress and the President are locked in a battle over the budget priorities that should take precedence right now.  President Obama and the Democrats want to stimulate the economy with more government spending to create more growth and jobs so that the economy can grow vigorously and unemployment can be brought down.  The Republicans want to hold the line on spending, and rein it in even further so that debt levels go down or at least stay where they are, regardless of such austerity's effects on the economy.  Their argument is that if the government gets its debt house in order, that will encourage private investment, resulting in economic growth and job creation.

But, is it really such a stark choice?  What if there is a way to spend money to goose the economy without worrying about the debt spiraling out of control?  Maybe, there is actually such a way, and it is getting lost in the ideological battle going on at the highest levels of the government.

Last I checked, the yield on US government-issued 10-year treasury notes (T-Notes) was 1.62%.  Also, last I checked, the inflation rate of the US over the last 12 months was 1.7%.  It was actually an annualized 2.3% in April, but because of a sudden steep fall in fuel prices, the May inflation numbers came in much lower than expected.  The average US inflation rate over the past 100 years has been about 3.4% per year.

What does all this mean?  It means that the real interest rate on a 10-year T-Note right now (actual interest rate - inflation rate) is actually negative.  So, investors are paying the US government money for the privilege of lending the US government money!  For every dollar the US government borrows at this rate, the government actually makes money!!
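The arithmetic behind that claim is simple enough to check (a back-of-the-envelope Python sketch using the figures quoted above; the variable names are mine):

```python
nominal_yield = 1.62   # 10-year T-Note yield, in percent, as quoted above
inflation = 1.70       # trailing 12-month US inflation, in percent

# Real interest rate = nominal rate minus inflation rate.
real_rate = nominal_yield - inflation
print(f"real rate: {real_rate:.2f}%")  # prints: real rate: -0.08%
```

A negative real rate means lenders get back less purchasing power than they handed over.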

What would you personally do if people from around the world lined up and offered to pay you money to borrow from them?  Would you not be raking in the money hand over fist as fast as you possibly could?  Unfortunately, it does not happen to you or me.  Ever.  In fact, it almost never happens for governments either.  But the US is in an extremely sweet spot right now.  It is large and stable.  There are no systemic risks like there are with its only global competitor, the Euro-Zone (which, by the way, could probably take advantage of such interest rates if they decided to issue Euro-Bonds instead of individual country-backed bonds they have stuck with so far).

Obviously, once the US does start taking advantage of these negative real interest rates, the interest rates will start moving up slowly.  That is just the way the law of supply and demand works.  Right now, there seems to be a severe shortage of US 10-year T-Notes in the world, so people are willing to pay a very high price for every one issued, making the absolute interest rate minuscule, and the real interest rate negative.  Once the supply of US 10-year T-Notes goes up, their price will fall, making the absolute interest rate higher.  Eventually, the real interest rate will become zero and then positive.

But, right now, the window of opportunity is still open.  This is what I would do if I were in the enviable position the US government seems to be in right now:  I would issue as much debt as I could at negative interest rates.  I would stop issuing debt only when the real interest rate on the debt becomes positive.  How much debt could the US issue at a negative real interest rate?  Who knows?  But it will probably be in the tens or hundreds of billions of dollars.

That money can then be used to stimulate the economy and create jobs.  Rebuild infrastructure.  Spend money on education, job-training, and research and development.  The effects on the economy and the unemployment rate can be quite immediate and dramatic.  Just the knowledge that the government is prepared to spend money on the economy is enough to spur private businesses that are sitting on the fence into growing and hiring.  Businesses that do not grow when the economy grows will be left behind, and businesses know that.

Obviously, fiscal hawks find all this talk of government spending quite distasteful.  But there are two reasons it should not be distasteful.  Firstly, as mentioned previously, the government is going to be making money on every dollar it borrows.  In fact, as the economy picks up steam, inflation is likely to go up too, so the government can afford to borrow money at higher absolute rates and still make money in the process!

Secondly, this course of action should be embarked upon with the full knowledge that it is not going to go on for ever and ever.  In fact, there should be strict limits on when the government can engage in this.  First, the real interest rate has to be less than or equal to zero.  Second, the spending should be wound down as the growth rate of GDP exceeds a certain number and the unemployment rate falls below a certain number.

When the second condition is achieved, the economy is already on solid footing and the government does not have to support it with spending anyway.  At the same time as the government spending on these stimulus programs is winding down, government receipts increase because of increased tax revenues.  These extra revenues should then be used to pay down the debt that was accumulated during the economic weakness.

Ideally, the process should have worked in reverse:  the government should have accumulated a reserve of money in the form of fiscal surpluses during economic good times.  These reserves can then be used to stimulate the economy when such stimulus is needed, with borrowing being a last resort only if the reserves are found to be inadequate.  And the reserves should be rebuilt once the economy recovers.

After all, this is what most people with common sense do.  It is not rocket science.  When you are in a good job and can afford to save money, you are supposed to lay away some money for a rainy day.  When you hit a bump in the road in the form of unexpected expenses or a job loss, you are supposed to access your savings to get over the rough spot.

Unfortunately, just like the US government, people have been ignoring this for the longest time.  They have lived on borrowed money (credit cards, home equity loans and the like) instead of saving money for a rainy day.  They have hit their individual debt ceilings, and then broken through them with new credit cards and more borrowing.  Now, they have no cushion to fall back on when they face a real need, such as a job loss or unexpected medical expense.  Bankruptcy and foreclosure result.  Unfortunately, the average Joe cannot borrow money at a negative real interest rate to tide over tough economic times.

But, the government still has the wherewithal to weather this without further pain.  It can take advantage of record-low interest rates to finance a robust recovery spurred by government spending.  The question is not whether the economy can be stimulated in the short term at practically no cost right now.  The question is whether the government will get fiscal religion, and start doing what needs to be done to clean its fiscal house for the long term.  Will it start paying down the debt and building up reserves in advance of the next economic downturn?

The Democrats have tended to break the bank in the past by not curtailing government spending when the need for it no longer exists.  Short-term problems tend to get saddled with long-term solutions that are wasteful and create government spending long after the need for such spending is gone.  Will they have enough discipline to adhere to strict conditions on when the borrowing and spending needs to stop and the paying down of debt needs to begin?

The Republicans have tended to break the bank in the past by doling out tax breaks (and getting into unnecessary and expensive wars).  There is nothing wrong with tax breaks as long as the government's needs are taken care of before the tax breaks happen.  Has the debt been paid off and a sufficient reserve built up?  At that point there is nothing wrong with tax breaks.  But short-changing the government to give tax breaks to the wealthy is not a sound idea.  Government reserves are not a waste of money.  They are necessary to tide over the troughs in the economic cycle that are inevitable.  Hamstringing the government from being able to provide such stimulus will only make these troughs longer and deeper.  Will they have enough discipline to adhere to strict conditions on the size of the reserves required before tax cuts can begin?

Assuming the two sides are mature enough to make the commitments above in the spirit of true cooperation, the way out of this long recession is clear, and could be quite an easy path.  Unfortunately, the maturity of both sides is questionable.  Even if they could make the commitments required, their ability to live up to those commitments is suspect.  So, perhaps the US is doomed to muddle its way out of this recession much more slowly than necessary.  What a shame.

But I have always believed that in a democracy, the people always get exactly the government they deserve.  Different people can blame different parts of the government for their plight, but ultimately, it is a government of them, by them and for them.  If they cannot agree on who to blame, is it any wonder their representatives cannot agree on how to get the economy moving again?

## Monday, June 11, 2012

### Unclear On The Concept

I was reading a news article over the weekend about how nutritionists are changing the popular view that you might suffer gross bodily harm if you don't drink 2 liters of water a day every day of your life.  I was reading it primarily because I have never followed what seemed like bad advice from the beginning.  I was a strong believer in responding to signals that a healthy body produces, such as thirst and hunger, before blindly stuffing myself with excessive amounts of food or water or anything else for that matter.  So, I was happy that this article vindicated my approach to taking care of the fluid needs of my body.

What I was surprised by was the number of comments beneath the article that not only seemed to demonstrate that people were ignorant about how the human body works (which is understandable in that not everybody wants to go into great technical depth about how their body works), but more importantly, about how science in general works.

There were several comments complaining about how scientists keep changing their views and recommendations, and how new research seems to invalidate a lot of older research results.  To me, all this seems completely natural and the way it should be.  But this seems to make a large number of people very uncomfortable.  People want absolute certainty in life, and science does not seem to want to oblige!

To me, the ability to change and evolve constantly is what makes science valuable.  Science advances only when old "truths" are refined or modified or completely set aside by new scientific findings.  The advancement of science does not mean that scientists in times past were wrong or stupid.  They did the best they could with the tools and techniques of their times.  New tools enable scientists to observe and quantify new things that may invalidate older observations.  Scientific techniques also evolve, making observations and measurements more accurate and reliable.  And, last but not least, scientists are only human:  so, sometimes they make mistakes that are not caught right away.

And let us not even get into the arena of pseudo-science, where corporations and other interested parties buy "scientists" to produce spurious results that favor their viewpoints.  The best known example of this kind of "science" is the effort that companies like Exxon-Mobil undertook as part of their corporate policy to create fear, uncertainty and doubt (FUD) in climate-change research.  Other examples of this are the large number of websites spouting absurd hypotheses about the health effects of whatever they are touting (usually some miracle supplement that makes you rich, handsome and healthy while creating world peace and solving world hunger), or dissing (genetically modified crops, food additives, vaccination, or whatever else catches their diseased imaginations).

Because of this, I can understand why some non-scientific people find the whole scientific process suspect and unsettling.  Answers change all the time.  It is difficult, if you don't have a good scientific background, to know what to believe and what not to believe.  But instead of making an effort to understand science so that they can appreciate the progress science has made, or evaluate "scientific" claims in a more balanced way, many people seem to want to vent their anger and frustration at their own ignorance on science and scientists.

Perhaps, this is one of the reasons for the popularity of religion compared to science.  After all, religion is the exact antithesis of science:  nothing changes, everything is certain.  The old is never replaced by new (either in thought or action).  There is no need or attempt to verify that what is presented as truth is actually true.  There is no need for painstaking research.  There is a well-organized hierarchy of religious figures (pastor, priest, bishop, cardinal, pope, etc.), and when there is a conflict of views, you always know who is correct (the one higher up in the hierarchy) and who is wrong (the one lower down in the hierarchy).  And you can safely ignore religious figures who are not part of your religion, so that makes it even easier to find and follow "the truth".

Well, here is my attempt at putting in words my thought-process when it comes to evaluating scientific claims, whether it be about your health or the health of the planet or the state of the universe:

• Learn a little about how science works.  Science is all about making valid connections that are true regardless of who tries to make the connection.  Science is repeatable.  Scientists publish their methods and results, and other scientists have to verify that when the method is repeated, the results also repeat (heard of cold fusion lately?).
• Learn a little about how the world works.  The internet has made science so much more accessible than it used to be.  You don't have to go hunting for books at a library and wait for years before the latest science is available in print form.  Websites like wikipedia, howstuffworks, etc. make scientific concepts easy to understand.  They also make it easy for everyone who has the inclination to understand the broad principles behind any field of science, whether it is human physiology or geology or astronomy.
• Who is the scientist making the claim?  What are his/her qualifications?  Does he/she have a track record of publishing peer-reviewed scientific papers in famous scientific journals in the field?
• Who is paying for the research?  Is there a hidden agenda?  This can be hard to find out.  If the claim is published in a famous scientific journal, usually such financial ties must be disclosed.  But if the "scientist" just sets up a website to broadcast his agenda, he/she need not disclose any such ties.  So, I always take non-peer-reviewed "scientific" discoveries and findings with a big bag of salt.
• Does the scientific claim seem plausible and common-sensical?  If the finding is from a famous scientific establishment with a long track record, and the finding has been peer-reviewed and found sound, then it is quite possible it is true and correct even if it sounds implausible at first (who would have believed that the earth revolved around the sun when it was clearly obvious that the earth was flat and the sun revolved around it from east to west?).  However, if the previous two filters raise questions about reliability, then the bar is pretty high for a claim to pass the smell test, as far as I am concerned.  So, when a former electrician "discovers" an amazing health supplement, and chooses to set up a website to tout it rather than publishing his "findings" in a good journal, it is time to move on!
Maybe, this post will help someone who is on the fence about science appreciate it for what it is.  Yes, it comes with warts, but it is still beautiful!  More importantly, maybe it will help people appreciate science for what it is not:  it is not dogma.  It is not static and unchanging.  It is not magical or miraculous.  It makes no promises that it cannot keep.  And it is not evil any more than any other inanimate object in the universe such as electrons and protons, or stars and planets, are evil.  Most importantly, I hope it reduces the number of people who are unclear on the concept of science.  Religion is religion and science is science, and there is nothing that prevents anybody from being religious about certain aspects of their life and scientific about other aspects of it.

## Friday, March 9, 2012

### Microsoft Access Tips & Tricks: Identify Runs Of Data

In this post, I am going to develop an SQL query that will help you identify runs of data in your data tables. What do I mean by a "run of data"? Suppose you have a database in which you enter details of your child's little league soccer team. And each game has a game number, and an indication of whether your child's team won, tied or lost. A "run" in this case can be a set of consecutive wins, losses or ties.

Your data may contain multiple runs, and this SQL query will help you identify all those runs and any other details about those runs that you want. Then you can order them by their length, for instance, to identify the longest and shortest runs. Have you ever imagined how play by play commentators on TV are able to reel out statistics such as "this will be the 8th consecutive game in which player X has accomplished such and such against so and so, and his longest run of such accomplishments is 24 games", and so on? Identifying runs of data will give you the ability to pry such insights out of your data too!

If you are interested, you can find my earlier posts on finding the median, the mode, the geometric and harmonic means, ranking every row in a query, selecting random rows out of a table, calculating running sums and averages, calculating running differences, creating histograms, calculating probability masses out of given data, calculating cumulative distributions out of given data, finding percentile scores, percentile values, calculating distinct counts, full outer joins, parameter queries, crosstab queries, working with system objects, listing table fields, finding unmatched rows, calculating statistics with grouping, job-candidate matching, job-candidate matching with skill levels, great circle distances, great circle initial headings, using Excel functions in Access, using the windows file-picker, using the Access date-picker, setting tab indexes quickly and correctly, pre-filling forms based on previous entries, highlighting form controls, conditional formatting, performing date manipulations, counting letters, words, sentences and paragraphs, calculating age, propercasing text entries, flattening a table (using SQL), flattening a table (using VBA), cascading comboboxes, parsing file names, and opening files from within Access.

For the purposes of this post, I will assume that you have a table of data called MyTable, that contains weather data for your city. In particular, you have the observation date, the observed high temperature for that date, and the average high temperature for that date. Your task is to identify runs of dates that had temperatures higher than average.

To identify runs of data in VBA is relatively simple. You open a recordset with the data from your table, order it any way you want, then walk down the recordset figuring out whether each record is the beginning or end of a run that satisfies the given criteria. Then calculate the lengths of the runs using the identified beginning and end points of the runs. Doing it in SQL is more challenging, but also much more convenient when you want to do it quickly. Mucking around with VBA every time you want to change the criteria that define a run can get tedious and error-prone.
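The walk-down-the-records idea described above is easy to see in any language. Here is a minimal sketch of it in Python (plain tuples standing in for a recordset; all names are mine for illustration, not from any Access library):

```python
# Scan ordered rows once, tracking where each run of "hits" starts and ends.
def find_runs(rows, is_hit):
    """rows: ordered records; is_hit: predicate defining the run condition."""
    runs, start = [], None
    for i, row in enumerate(rows):
        if is_hit(row) and start is None:
            start = i                      # a run begins on this row
        elif not is_hit(row) and start is not None:
            runs.append((start, i - 1))    # the run ended on the previous row
            start = None
    if start is not None:                  # a run still open at end of data
        runs.append((start, len(rows) - 1))
    return runs

# (observed temperature, average temperature) for five consecutive days
temps = [(80, 70), (81, 70), (60, 70), (85, 70), (82, 70)]
print(find_runs(temps, lambda r: r[0] > r[1]))  # [(0, 1), (3, 4)]
```

Changing the criteria means editing the predicate, which is exactly the maintenance burden the SQL approach below avoids.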

The SQL solution in this post is going to have three distinct parts: In the first part, we are going to identify the starting point of every run of data that satisfies the conditions we are looking for. In the second part, we are going to identify the ending point of every run of data. In the third part, we are going to combine the starting and ending points to define each run, so that you can identify interesting statistics such as the length of each run. Or you can order the runs to identify the longest runs, shortest runs, etc. Or you can group the statistics to find the average length of a run, median length of a run, etc.

Identifying the starting points of all runs of data: In this part, we are going to find out every row that starts a run of data that satisfies the condition we are looking for. In particular, referring back to our task for this post, we are going to identify every date on which the observed temperature is higher than the average temperature, and the observed temperature on the previous date was not higher than the average temperature.

The SQL to do this is conceptually quite simple. We self-join the table with itself and join each row with the previous row in the table. We can then use a where condition to find out which rows satisfy the criteria we are looking for while at the same time the previous row with which it is joined does not satisfy the criteria. The SQL for this is shown below:
```sql
SELECT T1.[WeatherDate] AS RunStart
FROM MyTable T1 INNER JOIN MyTable T2
  ON T1.[WeatherDate] = T2.[WeatherDate] + 1
WHERE T1.[ObsTemp] > T1.[AvgTemp]
  AND NOT (T2.[ObsTemp] > T2.[AvgTemp])
```
I have used NOT for identifying dates on which the condition is not satisfied because negating an entire condition with NOT is less error-prone than negating each part of it by hand, especially when the condition is complicated and contains a lot of AND's and/or OR's.  In this case, that was not strictly necessary, but doing it consistently means I never have to remember to do it one way for simple conditions and a different way for complicated ones.

Now, there is a problem with the SQL above. And that problem becomes apparent when there are gaps in the data. Suppose your table had data for the 31st of March, but not the 30th of March (for some reason that is irrelevant to this post). If the condition is satisfied on the 31st of March, that row is not identified as a RunStart because that row is not joined with any other row in the table.

The same problem happens even if there is no gap in the table. It also happens when the very first row of the table satisfies the condition. That first row is not joined with any other row in the table (because there does not exist a row with the previous date), so it is not identified as the start of a run. To overcome these problems, we modify the above SQL as below:
```sql
SELECT T1.[WeatherDate] AS RunStart
FROM MyTable T1 LEFT JOIN MyTable T2
  ON T1.[WeatherDate] = T2.[WeatherDate] + 1
WHERE T1.[ObsTemp] > T1.[AvgTemp]
  AND (NOT (T2.[ObsTemp] > T2.[AvgTemp]) OR T2.[ObsTemp] IS NULL OR T2.[AvgTemp] IS NULL)
```
Now, the left join makes sure that all dates without a previous date in the table are still present in the output. The previous date data in those rows is NULL, and we have decided that NULL data cannot satisfy the condition we are looking for. So, we identify all such dates as starts of runs as long as those dates satisfy the conditions we are looking for.

Identifying the ending points of all runs of data: In this part, we are going to find out every row that ends a run of data that satisfies the condition we are looking for. In particular, referring back to our task for this post, we are going to identify every date on which the observed temperature is higher than the average temperature, and the observed temperature on the next date was not higher than the average temperature.

Given the SQL for identifying the starting points of all the runs, it is easy to modify it to identify the ends of all the runs. The modified SQL is below:
```sql
SELECT T1.[WeatherDate] AS RunEnd
FROM MyTable T1 LEFT JOIN MyTable T2
  ON T1.[WeatherDate] = T2.[WeatherDate] - 1
WHERE T1.[ObsTemp] > T1.[AvgTemp]
  AND (NOT (T2.[ObsTemp] > T2.[AvgTemp]) OR T2.[ObsTemp] IS NULL OR T2.[AvgTemp] IS NULL)
```
Note that the join condition now says "T1.[WeatherDate] = T2.[WeatherDate] - 1". So, we are joining each row with the next row in the table, rather than the previous row. And the left join ensures that we identify ends of runs even when the next date for a particular date is not present in the table, or the last row of data in the table satisfies the condition.

Define all runs of data: Now we have come to the crucial third part of our task. We are going to use the queries above to uniquely identify each run of data and produce statistics about the run that include the starting date, the ending date and the length in days of each run. We are going to use some trickery to do this, though!

What we are going to do is use the SQL statements above as temporary tables. We are going to join the two temporary tables using the condition that the end of a run must always be greater than or equal to the start of a run (note that there can be single observation runs of data in which the start of the run is also the end of the run. That is why it is important to emphasize that the end of a run is greater than OR EQUAL TO the start of a run, not necessarily strictly greater than).

Obviously, this is going to result in every run start date being joined with every run end date that is greater than or equal to itself. To identify the true end date for a given start date, we take the minimum of all the run end dates that are joined with that run start date. So, we group by run start dates and pick the minimum of the run end dates. This grouping is the trickery that I refer to in the previous paragraph! Obviously, the minimum of the run end dates that is greater than or equal to a given run start date is the true end date for that run. Once you are convinced of that, the SQL below should not present you any problems.
```sql
SELECT RunStart AS Start, MIN(RunEnd) AS [End], MIN(RunEnd) - RunStart + 1 AS Length
FROM
  (SELECT T1.[WeatherDate] AS RunStart
   FROM MyTable T1 LEFT JOIN MyTable T2
     ON T1.[WeatherDate] = T2.[WeatherDate] + 1
   WHERE T1.[ObsTemp] > T1.[AvgTemp]
     AND (NOT (T2.[ObsTemp] > T2.[AvgTemp]) OR T2.[ObsTemp] IS NULL OR T2.[AvgTemp] IS NULL)) AS SQL1
INNER JOIN
  (SELECT T1.[WeatherDate] AS RunEnd
   FROM MyTable T1 LEFT JOIN MyTable T2
     ON T1.[WeatherDate] = T2.[WeatherDate] - 1
   WHERE T1.[ObsTemp] > T1.[AvgTemp]
     AND (NOT (T2.[ObsTemp] > T2.[AvgTemp]) OR T2.[ObsTemp] IS NULL OR T2.[AvgTemp] IS NULL)) AS SQL2
  ON SQL1.RunStart <= SQL2.RunEnd
GROUP BY SQL1.RunStart
```
Note that End is a reserved word, so the output column name is wrapped in square brackets.
You can now order the results from this query by any fields you want. In particular, ordering by the start dates in ascending order will give you the earliest runs first, while ordering by the start dates in descending order will give you the latest runs first. Ordering by the length in ascending or descending order will enable you to identify the shortest or longest runs of data in the table respectively.

You can add other WHERE conditions in both queries to further restrict the runs that are identified. For instance, you can add conditions to only identify runs of above-average temperatures in winter, or in a particular year, or only when the minimum temperatures are below the averages, or whatever you want. If you have the data for it, you can identify the runs in it. Now you can be a celebrity weatherman at your parties, reeling off weather statistics and leaving everyone else wondering where you are getting them from!
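If you would like to experiment with this pattern outside Access, here is a sketch of the same three-part query run against SQLite from Python. It uses integer day numbers in place of real dates (a simplifying assumption, so that "the previous day" is just d - 1); day 5 is deliberately missing to exercise the gap handling, and day 3 is below average to break the runs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTable (d INTEGER, ObsTemp REAL, AvgTemp REAL);
    INSERT INTO MyTable VALUES
        (1, 80, 70), (2, 81, 70), (3, 60, 70),
        (4, 85, 70), (6, 82, 70), (7, 83, 70);
""")
runs = conn.execute("""
    SELECT RunStart, MIN(RunEnd), MIN(RunEnd) - RunStart + 1 AS Length
    FROM
      (SELECT T1.d AS RunStart
       FROM MyTable T1 LEFT JOIN MyTable T2 ON T1.d = T2.d + 1
       WHERE T1.ObsTemp > T1.AvgTemp
         AND (NOT (T2.ObsTemp > T2.AvgTemp) OR T2.ObsTemp IS NULL)) AS SQL1
    INNER JOIN
      (SELECT T1.d AS RunEnd
       FROM MyTable T1 LEFT JOIN MyTable T2 ON T1.d = T2.d - 1
       WHERE T1.ObsTemp > T1.AvgTemp
         AND (NOT (T2.ObsTemp > T2.AvgTemp) OR T2.ObsTemp IS NULL)) AS SQL2
      ON SQL1.RunStart <= SQL2.RunEnd
    GROUP BY RunStart
    ORDER BY RunStart
""").fetchall()
print(runs)  # [(1, 2, 2), (4, 4, 1), (6, 7, 2)]
```

The single-day run on day 4 and the run that begins right after the gap at day 5 both come out correctly, which is exactly what the left joins and the IS NULL checks are there for.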

Hope this post has been helpful in solving any problems you might have had with identifying runs of data in Microsoft Access. If you have any problems or concerns with the SQL in this post, please feel free to let me know by posting a comment. If you have other questions on Access that you would like me to address in future posts, please feel free to let me know through your comments too. Good luck!

## Saturday, February 4, 2012

### It Is Time For A Numerical Puzzler This Time!

It is another week, so it must be time for another puzzler! I have gotten into this habit of getting stumped on a pretty regular basis by mathematical puzzles that my children bring home from school. And this time, it was a numerical puzzler brought home by my younger daughter. Let me share the puzzle with you first, and then I will tell you where I am having a problem with solving it.

A list of numbers starts with a two-digit number. The second number in the list is derived by adding the sum of the digits of this first number to the first number itself. The result is another 2-digit number. The third number in the list is then formed by adding the sum of the digits of the second number to the second number. The third number in the list is 44. What is the first number in the list?

At first glance, it seems pretty straightforward. In fact, when my daughter read it to me, my first thought was that the solution would end up being a system of simultaneous equations from which the answer would pop out. In fact, since my daughter is learning about simultaneous equations right now, that seemed a pretty safe guess. But I was quite mistaken!

I immediately started out formulating the problem as below:

Let 10x + y be the first number in the list. The second number in the list is, therefore, 10x + y + x + y. This can be simplified into 11x + 2y. Now, 11x + 2y is equal to a two-digit number also. Let us represent that number by 10a + b. Now, the third number in the series will be 10a + b + a + b. Thus, 11a + 2b. And it is given that 11a + 2b = 44.

Excellent so far. But looking closely at the system above, it is obvious that there are only 2 equations in 4 unknowns:

11x + 2y = 10a + b
11a + 2b = 44

Obviously, this system is not solvable by algebraic methods. And, it is unlikely to be the solution methodology envisioned by my daughter's teacher since her class is barely into solving simultaneous equations in 2 variables. I don't expect even my older daughter to solve a system of 4 simultaneous equations in 4 variables (assuming I would find two more equations in a, b, x and y hiding out there somewhere, waiting to be discovered). And to be perfectly honest, I don't look forward to solving such a system by hand either!

Obviously, I am missing something. But what? I can't think of anything intuitive that I have missed in the above analysis. But I must be.

By the way, it is easy enough to solve this problem by trial and error. Given that 11a + 2b = 44, and 0<=a<=9 and 0<=b<=9, it is easy to see that a cannot be 1 (2b would have to be 33, which is odd), a cannot be 2 (b would have to be 11, which is not a digit), and a cannot be 3 because 33 plus an even number (which is what 2*b is) cannot be 44.

So, a is 4 and b is 0, which makes the second number 40, and applying the logic backwards one more step with x and y, you get the first number in the list as 29. So, the list is 29, 40, 44, ...

But, being able to solve a problem by trial and error is never any fun. It never gives you the sense of satisfaction you get when you solve the problem correctly so that you derive a general formula for the answer that you can apply without having to stumble around in the dark. What if I had been given the 11th number in the list and asked to derive the first number of the list? That would be a lot of trial and error to work through to get to the final answer.
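Of course, the guess-and-check itself can at least be automated. A few lines of Python (my own sketch, with made-up helper names) search every two-digit start, and would handle an 11th-number variant just as quickly:

```python
def next_num(n):
    """Add the sum of a number's digits to the number itself."""
    return n + sum(int(d) for d in str(n))

def kth(n, k):
    """The k-th number of the list that starts at n (n itself is the 1st)."""
    for _ in range(k - 1):
        n = next_num(n)
    return n

# Search every two-digit starting number whose third term is 44.
starts = [n for n in range(10, 100) if kth(n, 3) == 44]
print(starts)  # [29]
```

It confirms that 29 is the only two-digit start that works, but of course it sheds no light on why, which is the real itch here.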

I must admit that I hate problems that involve the digits of a number. The basic problem is that the digits of a number are not at all obvious when the number is expressed as a polynomial unless the coefficients are all powers of 10.

Take the second number of this list, for instance. It can be expressed as 11x + 2y. But what exactly are the digits of this number? Is there a formula expressible in x and y that will tell me what the digits of the number are? In this case, 11x + 2y turned out to be the number 40. 4 is not even divisible by or a factor of 11. Similarly, 11a + 2b turned out to be 44. But in the final answer, b was not even equal to 2!

I guess the puzzle this time is: Is there a better method than guess and check to solve problems like this? What am I missing that is causing me not to be able to solve this puzzle (and others similar to this) algebraically?

Now, should I be disturbed by my inability to solve this problem algebraically given that I was able to solve it by trial and error? I think so. Let me pose a slightly different problem for you. A list starts with a certain two-digit number. The second number in the list is derived by adding to that number 4 times its first digit (tens digit) and subtracting 5 times its second digit (units digit). The result is another two-digit number. The third number is derived by adding to the second number 4 times the second number's tens digit and subtracting 5 times the second number's units digit. The process is continued on and on. The sixth number in the list is 82. What is the first number in the list?

You see the problem with the trial and error problem-solving technique now? What if I made the list start with a 3 or 4-digit number, and derived the subsequent numbers in the list using even more complicated mathematical manipulations? And then gave you the 75th number of the list and asked you for the 1st number? Or even gave you the 1st number and asked you for the 75th? Without an algebraic way of going back and forth down the list quickly, it would become pretty painful pretty quickly.
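For what it is worth, the same brute-force sketch adapts directly to this variant, and it also answers a question the algebra would have to settle separately: whether the answer is even unique.

```python
def step(n):
    """Add 4 times the tens digit and subtract 5 times the units digit."""
    return n + 4 * (n // 10) - 5 * (n % 10)

def sixth(n):
    for _ in range(5):
        n = step(n)
    return n

# Search every two-digit starting number whose sixth term is 82.
candidates = [n for n in range(10, 100) if sixth(n) == 82]
print(candidates)  # [41, 68] -- more than one starting number works!
```

So a computer search is a fine crutch, but it only sharpens the original complaint: it finds the answers without explaining the structure behind them.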

Now, I do understand that these types of problems are rather unique in one respect: Each of these problems is written based on using a particular base for the numbers (in this case, base 10), and can be solved only in that base. In most other mathematical problems, the base in which numbers are expressed is completely irrelevant. But any time the digits of a number enter the equation, the base in which the numbers are expressed becomes extremely important.

So, where do you think my mental block is? Is guess and check the best method for problems of this sort? Is there a better method I am not thinking of? Are problems which do not have unique solutions without reference to a particular base of operations doomed to trial and error methods of solution? How do I take the base of operations into consideration explicitly in that case when trying to derive an algebraic solution? Anybody willing to help me in this effort? Please feel free to chime in with comments on this post. Good luck!

## Friday, January 20, 2012

### I Have An Answer To My Question And A Solution To My Puzzle!

In my previous post, I talked about a puzzle that was triggered by a geometry problem that my daughter brought home from school. The puzzle is quite simple, and involves midpoint polygons. These are inscribed polygons that are derived by joining the midpoints of the sides of the exterior polygons. The puzzle was to find the ratio of the perimeter of the midpoint polygon to the perimeter of the outer polygon as a function of the number of sides of the polygon.

Last week, I stopped at hexagons because my techniques were not good enough to allow me to calculate the perimeter of a midpoint heptagon given a regular exterior heptagon of unit sides. But, I kept thinking about the puzzle on and off, and have now come up with a solution that should work for any regular polygon. In this post, I am going to explain how my method works, and present some results of my investigations.

The good news is that it is possible to express the ratio in terms of the number of sides for any regular polygon. The bad news is that the formula looks quite ugly and complicated. All the simple formulas like E=mc2 are already taken!

The secret to working out the ratio starts with drawing a diagonal that connects two alternate vertices of the exterior polygon. By alternate vertices, I mean two vertices that are separated by one intermediate vertex. In the figure to the left, consider segments AB and BC to be part of the exterior polygon (I have not drawn the rest of the polygon because I want to derive a formula that applies to polygons with any number of sides. So, my figure only shows two sides of a polygon with an arbitrary number of sides). They are equal to each other because the polygon is regular. Let us assume they are each 1 unit in length.

Segment AC is a diagonal that connects two alternate vertices of the exterior polygon (B is the intermediate vertex that is skipped over by the diagonal). ABC is now an isosceles triangle. Angle ABC is an interior angle of the polygon, and it is a function of the number of sides the polygon has. In particular, we know that angle ABC is equal to (n-2)*180/n degrees, where n is the number of sides of the regular polygon. In the figure, I have represented this angle by x.

We also know that angles BCA and BAC are equal, and because they are part of triangle ABC, we know that each is equal to (180–x)/2. Since x = (n-2)*180/n degrees, we can substitute that in the expression above, and get the measure of angles BAC and BCA to be 360/2n degrees. In the figure, I have represented this angle by y.

Surprisingly, that is all the information we need to get the ratio we are after. The first step in calculating the ratio is the calculation of the length of segment AC. Why? Because, we know from my previous post that each side of the midpoint polygon is half the length of the diagonal that connects alternate vertices of the exterior polygon.

So, how do we go about calculating the length of segment AC? We apply the law of sines to triangle ABC. We know that angle x is opposite the segment AC whose length we want, and angle y is opposite the segment AB whose length is 1. Thus, AC/sin(x) = 1/sin(y). Therefore, AC = sin(x)/sin(y).

At this point, we can express x and y in terms of n, and we will have AC in terms of n. Divide the length of AC by 2, and that is the ratio we are after (the length of one of the sides of the midpoint polygon). Doing these substitutions, we have:

AC = sin((n-2)*180/n)/sin(360/2n)

Therefore, the ratio of the perimeter of the midpoint polygon to the perimeter of the regular exterior polygon (which is half of AC) would be:

R = sin((n-2)*180/n)/(2*sin(360/2n))

Well, it may not be pretty, or roll off the tongue quite like E=mc2, but it is a function of n, the number of sides of the polygon, and that is what we started out seeking. So, pretty or not, we have achieved what we wanted to achieve, and anyway, half the fun of getting there is the traveling itself, right? So what if the destination was not stunning, the route was plenty scenic, right?!

Now, how do I know this is correct? Below is a table with the value of R calculated for regular polygons with different number of sides. The first 3 lines correspond to polygons that we dealt with in the previous post (squares, pentagons and hexagons). I had to start with squares because triangles do not have diagonals (unless you consider each side to be a diagonal also). As you can see, the value of the ratio as calculated using the above formula is identical to the value of the ratio which we derived using other methods in the previous post.

| Number of Sides (n) | Interior Angle ABC (x) (in degrees) | Angle with Diagonal (y) (in degrees) | Length of Diagonal AC | Ratio of Perimeters (R) |
|---|---|---|---|---|
| 4 | 90 | 45 | 1.414213562 | 0.707106781 |
| 5 | 108 | 36 | 1.618033989 | 0.809016994 |
| 6 | 120 | 30 | 1.732050808 | 0.866025404 |
| 7 | 128.5714286 | 25.71428571 | 1.801937736 | 0.900968868 |
| 8 | 135 | 22.5 | 1.847759065 | 0.923879533 |
| 9 | 140 | 20 | 1.879385242 | 0.939692621 |
| 10 | 144 | 18 | 1.902113033 | 0.951056516 |
| 100 | 176.4 | 1.8 | 1.999013121 | 0.99950656 |
| 1,000 | 179.64 | 0.18 | 1.99999013 | 0.999995065 |
| 10,000 | 179.964 | 0.018 | 1.999999901 | 0.999999951 |
| 1,000,000 | 179.99964 | 0.00018 | 2 | 1 |

It is interesting to note that, as predicted, the ratio does approach 1 as the number of sides goes up. I have added rows for 100, 1000, 10000 and 1000000 sides to the table above just to illustrate how the ratio gets closer to 1 as the number of sides goes up. There are so many 9's in the answers for the polygon with a million sides that I decided to do away with them, and put down the answer rounded to a billionth!
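As a numerical cross-check, here is a short Python sketch that computes R directly from the two angles via the law of sines. It also confirms a tidier equivalent form, R = cos(180/n) in degrees, which follows because sin((n-2)*180/n) = sin(360/n) = 2*sin(180/n)*cos(180/n):

```python
import math

def perimeter_ratio(n):
    """Half the diagonal AC of a unit-sided regular n-gon, via the law of sines."""
    x = math.radians((n - 2) * 180 / n)   # interior angle of the n-gon
    y = math.radians(180 / n)             # base angle of the isosceles triangle ABC
    return math.sin(x) / (2 * math.sin(y))

for n in (4, 5, 6, 7, 10, 100):
    # The two forms agree to machine precision for every n.
    assert abs(perimeter_ratio(n) - math.cos(math.pi / n)) < 1e-12

print(round(perimeter_ratio(4), 9), round(perimeter_ratio(6), 9))
# 0.707106781 0.866025404
```

The cosine form also makes the limiting behavior obvious: as n grows, 180/n goes to 0, so R goes to cos(0) = 1, exactly as the table shows.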

By the way, a polygon with 100 sides is called a hectogon. A polygon with 1,000 sides is called a chiliagon, one with 10,000 sides is called a myriagon, and one with a million (1,000,000) sides is called a megagon!

Since we know that the ratio of the area of an inscribed midpoint polygon to the area of the exterior regular polygon is the square of the ratio of perimeters, you can just take the numbers from the table in this post and square them to get the ratios for the areas without having to explicitly calculate the areas of either the exterior polygons or the inscribed midpoint polygons.

There exist formulas for the area of a regular polygon based on the number of sides, and the length of each side, so if you are curious, you can calculate the area of the exterior polygon using such a formula, calculate the area of the midpoint polygon using the ratio in the table to compute the length of each side, and then calculate the ratio of the areas. The amazing thing about mathematics is that you can derive general results (such as: the ratio of areas is the square of the ratio of the perimeters), and after that you don't have to do all the hard work involved in calculating specific results. But the tools exist to do so if you want to.

Well, I consider this another example of using very simple, well-known concepts, and stringing them together in just the right order to derive something a little more interesting and complicated. It is what makes and keeps mathematics so interesting and challenging. And hopefully, it will keep our brains young and fresh! Good luck in your mathematical explorations!!

## Thursday, January 12, 2012

### An Interesting Geometrical Puzzle

My daughter recently brought home a puzzle that had me scratching my head for a while before being able to solve it. Actually, the solution was quite simple once I figured it out, but it spawned off a different puzzle in my head that may be a little trickier. In any case, I have not figured out a solution to the spawned-off puzzle yet, but let me not get ahead of myself. Let me first tell you about the puzzle that my daughter brought home.

You are given a regular pentagon, ABCDE (regular in this context means that all the sides are of equal length, and all the interior angles are equal to 108 degrees). You are also given the fact that the diagonal AC of this pentagon is 10 units long. Please refer to the figure on the left.

Now, inside this pentagon is another pentagon formed by joining the midpoints of each of the sides of the original pentagon. Call this inscribed pentagon FGHIJ. The puzzle is to use all this information to find the perimeter of the pentagon FGHIJ.

Initially, for some reason, I thought the solution would consist of finding the perimeter of ABCDE first and then somehow using that along with the length of AC to derive the perimeter of FGHIJ. That is what had me floundering for a few minutes. In fact, I even looked up the Wikipedia article on pentagons for some inspiration.

But the solution is actually much simpler (obviously, the fact that this puzzle was given to a middle-school student as an assignment is a big hint that this is not the geometric equivalent of Fermat's last theorem!). It is simply sufficient to realize that the diagonal creates a triangle (ABC in this case), and the line segment FG connects the midpoints of two of the sides of this triangle.

You can then apply the midpoint theorem on this triangle to conclude that side FG must be equal to 5 units in length because AC is 10 units long. And because FGHIJ is a regular pentagon, being the midpoint pentagon of a regular pentagon, its perimeter must be 5*5 = 25 units.

But, obviously, half the fun of solving a mathematical problem is coming up with related puzzles and questions. In particular, the puzzle I am now stuck with is as follows: what is the ratio of the perimeters of midpoint polygons to the perimeters of the polygons they are inscribed in as a function of the number of sides of the polygon? Similarly, what is the ratio of the areas of midpoint polygons to the areas of the polygons they are inscribed in as a function of the number of sides of the polygon?

Note that the polygons formed by joining the midpoints of the sides of a given polygon are called midpoint polygons, and have been studied by mathematicians in some detail. In particular midpoint polygons constructed from regular polygons are also regular, and are geometrically similar to the exterior polygon.

Let us take a regular triangle (also known as an equilateral triangle) first. Connecting the midpoints of the three sides to create an inscribed midpoint triangle inside the original triangle gives us a triangle with half the perimeter as the original. Note that in the figure below, triangle ADF is also equilateral, and since AD and AF are one half of the length of AB, DF is one-half of the length of AB. And since DE, EF and DF have the same length (each equal to half the length of the sides of the original triangle ABC), the new triangle has half the perimeter as the original triangle.

Also note that the original triangle ABC has been divided into 4 equilateral triangles of equal area by the addition of the inscribed triangle DEF. Thus, the area of triangle DEF is one quarter of the area of the original triangle ABC. Thus, the ratio of perimeters is 1/2, and the ratio of areas is 1/4.

Moving on to a square, which is a regular quadrilateral, we can use the Pythagorean theorem on triangle EBF to figure out that the length of EF is equal to (length of AB)/sqrt(2). And the area of EFGH is one-half of the area of ABCD. Thus, the ratio of perimeters is 1/sqrt(2), and the ratio of areas is 1/2.
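The same check for the square takes only a few lines. The side length and labels here are assumptions for the sketch, with E and F taken to be the midpoints of sides AB and BC.

```python
import math

s = 1.0  # side length of square ABCD (chosen for this sketch)

# E and F are midpoints of AB and BC, so EB = BF = s/2 and angle B is right.
ef = math.hypot(s / 2, s / 2)  # Pythagorean theorem on triangle EBF

print(ef / s)         # ratio of perimeters: 1/sqrt(2)
print((ef / s) ** 2)  # ratio of areas: 1/2
```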

The situation becomes more complicated in the case of a pentagon. In fact, it is not easy to figure out what the length of each side of an inscribed pentagon is if you know just the lengths of the sides of the original pentagon. But my research into regular pentagons on the Wikipedia site did have a positive side-effect: it turns out that the diagonal of a regular pentagon and its side are in the golden ratio. In other words, if the length of a side is 1 unit, the length of a diagonal is equal to the golden ratio. The value of the golden ratio is 1.6180339887498948482045868343656.

From my solution of the puzzle which started this entire exploration, we also know that the length of each side of the inscribed pentagon is one half of the length of the diagonal. Thus, if the length of each original side is 1 unit, then each diagonal is 1.6180339887498948482045868343656 units long, and each side of the inscribed pentagon is 1.6180339887498948482045868343656/2 units long. Thus, the ratio of perimeters is 1.6180339887498948482045868343656/2.
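Here is a sketch verifying both the diagonal-to-side ratio and the resulting perimeter ratio. The construction (vertices placed on a circumscribed circle whose radius makes the side exactly 1) is my own assumption.

```python
import math

phi = (1 + math.sqrt(5)) / 2  # the golden ratio, ~1.6180339887

# Regular pentagon with side 1: this circumradius R puts consecutive
# vertices exactly one unit apart (a construction assumed for this sketch).
R = 1 / (2 * math.sin(math.pi / 5))
verts = [(R * math.cos(2 * math.pi * k / 5), R * math.sin(2 * math.pi * k / 5))
         for k in range(5)]

side = math.dist(verts[0], verts[1])      # 1 by construction
diagonal = math.dist(verts[0], verts[2])  # equals the golden ratio

print(diagonal)      # ~1.6180339887
print(diagonal / 2)  # ratio of perimeters: ~0.8090169944
```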

The same Wikipedia article also tells me that the area of a regular pentagon with sides of length t is approximately 1.720477401*t^2. Thus, the area of the inscribed pentagon would be 1.720477401*(1.6180339887498948482045868343656*t/2)^2. What we are interested in is the ratio of the areas, not the actual areas themselves, and that is simply (1.6180339887498948482045868343656/2)^2.

What about a hexagon? Well, the analysis is a little more complicated in this case. First of all, there are two formulas for the area of a hexagon that we can use to derive the length of a diagonal of the hexagon that connects two alternate vertices (note that a hexagon also has three diagonals that connect opposite vertices; these are twice the length of each side, and are longer than the diagonals we are interested in). You can find these formulas in the Wikipedia article on hexagons.

The first formula for the area is: area = 2.598076211*t^2. The second formula is: area = 1.5*d*t, where d is the length of the diagonal connecting alternate vertices of the hexagon (it is the height of the hexagon when it is resting on one of its sides as its base). Since both formulas must give the same area, we can say that 2.598076211*t^2 = 1.5*d*t. From this, we can derive the value of d to be 2.598076211*t/1.5.

At this point, we can then use the midpoint theorem to say that each side of an inscribed hexagon would be one-half the length of each of these diagonals. Thus the ratio of perimeters would be 2.598076211/3. And the ratio of areas would be the square of that number since the area is directly proportional to the square of the length of a side.
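The derivation above can be checked with a few lines of Python, using the exact constant 3*sqrt(3)/2 in place of the rounded decimal 2.598076211:

```python
import math

t = 1.0                               # side length of the regular hexagon
area = (3 * math.sqrt(3) / 2) * t**2  # first formula: ~2.598076211 * t^2

# Second formula: area = 1.5 * d * t, so the short diagonal d is:
d = area / (1.5 * t)

print(d)            # sqrt(3), the diagonal connecting alternate vertices
print((d / 2) / t)  # ratio of perimeters: sqrt(3)/2 ≈ 0.8660254
```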

So, that is as far as I have gotten. The area formula for heptagons does not allow me to calculate the length of a diagonal connecting alternate vertices of a heptagon, so I cannot calculate the length of a side of an inscribed midpoint heptagon either. You could say I am stuck!

In table form, one can express this as below:

| Number of sides | Ratio of perimeter of midpoint polygon to perimeter of exterior polygon | Ratio of area of midpoint polygon to area of exterior polygon |
|:---:|:---:|:---:|
| 3 | 0.50 | 0.25 |
| 4 | 0.70710678118654752440084436210485 | 0.50 |
| 5 | 0.8090169943749474241022934171828 | 0.65450849718747371205114670859138 |
| 6 | 0.86602540378443864676372317075294 | 0.75 |
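One way to get unstuck for the heptagon (and beyond) is to compute the ratios numerically straight from the vertex coordinates, bypassing the area formulas altogether. The sketch below is my own construction: it builds a regular n-gon on the unit circle, forms its midpoint polygon, and reproduces the table above for 3 to 6 sides before continuing to larger n.

```python
import math

def midpoint_ratios(n):
    """Perimeter and area ratios of the midpoint n-gon of a regular n-gon."""
    verts = [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
             for k in range(n)]
    mids = [((verts[i][0] + verts[(i + 1) % n][0]) / 2,
             (verts[i][1] + verts[(i + 1) % n][1]) / 2) for i in range(n)]

    def perimeter(p):
        return sum(math.dist(p[i], p[(i + 1) % n]) for i in range(n))

    def area(p):  # shoelace formula
        return abs(sum(p[i][0] * p[(i + 1) % n][1] -
                       p[(i + 1) % n][0] * p[i][1] for i in range(n))) / 2

    return perimeter(mids) / perimeter(verts), area(mids) / area(verts)

for n in range(3, 9):
    p, a = midpoint_ratios(n)
    print(n, round(p, 12), round(a, 12))
```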

Pretty basic observations follow from this.
• For regular polygons, area seems to be always proportional to the square of the length of a side. Thus, the ratio of areas of inscribed midpoint polygons to exterior polygons would be the square of the ratio of perimeters.
• The ratio of perimeters (and thus the areas) approaches 1, and will be 1 in the asymptotic case of a polygon with infinite number of sides (one could argue that the inscribed midpoint circle of a given circle is coincident with the given circle, and thus the ratio of perimeters as well as areas is indeed 1).
• Since the ratio of areas is the square of the ratio of perimeters, and the ratio of perimeters is less than 1.0, the ratio of areas is always less than the ratio of perimeters.

These observations though, do not tell me the answer to the question I started with: what is the ratio of the perimeter of an inscribed midpoint polygon to the perimeter of the exterior polygon? Is it possible to find an expression that tells me the value of this ratio for any type of polygon (obviously, such an expression would be a function of the number of sides of the polygon)? Any ideas? I would love to hear your thoughts on how to proceed with this investigation. Thank you, and good luck!

## Wednesday, January 11, 2012

### Anti-Reflective Coatings Have Come A Long Way

Anti-reflective coatings are used on optical lenses to increase the amount of light collected, and to prevent stray light from bouncing around inside systems with multiple lenses. They have been around for a while now, having been invented in Germany in the 1930s. Anti-reflective coatings on corrective glasses make them look better, and also reduce glare, especially in high-contrast situations like night-time driving.

When I bought my daughter's latest pair of eyeglasses, I decided to go with Zenni's latest-generation anti-reflective coating on the lenses. Not only does it reduce glare and make the glasses less likely to stand out in photographs, but it also boasts several advanced features.

This new generation of AR coatings is hydrophobic and oleophobic. This means that the coating repels not only water, but also oils. Repelling oil is significant because most smudges, fingerprints, etc., on lenses are caused by the oils on your hands and fingers. By repelling most oils, the oleophobic lenses stay cleaner longer, and are much easier to clean.

The differences between her old pair of glasses and the new pair have made a believer out of me as far as this coating is concerned. My daughter gets crystal clear vision without having to deal with glare from table lights, her computer screen, overhead lights, etc. Since more light passes through the lenses instead of being reflected away, things appear brighter, and with better contrast. The lenses have been much easier to clean and keep clear of dirt, smudges and fingerprints. And, best of all, she likes the fact that she does not have to remove her glasses before posing for photographs!
