H2 Mathematics Textbook CHOO YAN MIN

Covers both 9740 & 9758 syllabuses. Includes TYS & Answers.

This version: September 28, 2017. The latest version will always be at this link.

This book is optimised for viewing in PDF format (click the above link). Other existing formats are crude conversions and may be sub-optimal.

This textbook was first completed in July 2016. Since then, only small changes (usually corrections of typos) have been made.

Page 2, Table of Contents

www.EconsPhDTutor.com

Errors? Feedback? Email me! With your help, I plan to keep improving this textbook.


Please do not be intimidated by the length of this book (~1,300 pages). The actual main content takes up only about 700+ pages. The other 600 pages are for things like front matter, TYS questions, appendices, reproductions of formula lists and syllabuses, and answers to exercises.


This book is licensed under the Creative Commons license CC-BY-NC-SA 4.0.

You are free to:
• Share — copy and redistribute the material in any medium or format
• Adapt — remix, transform, and build upon the material

The licensor cannot revoke these freedoms as long as you follow the license terms.

Under the following terms:
• Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
• NonCommercial — You may not use the material for commercial purposes.
• ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
• No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

Notices: You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation. No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.

Author: Choo, Yan Min.
Title: H2 Mathematics Textbook.
ISBN: 978-981-11-0383-4 (e-book).


The first thing to understand is that mathematics is an art. - Paul Lockhart (2009, A Mathematician’s Lament, p. 22).

A mathematician, like a painter or a poet, is a maker of patterns. If his patterns are more permanent than theirs, it is because they are made with ideas. ... Beauty is the first test: there is no permanent place in the world for ugly mathematics. - G.H. Hardy (1940 [1967], A Mathematician’s Apology, pp. 84-85).

The scientist does not study nature because it is useful to do so. He studies it because he takes pleasure in it, and he takes pleasure in it because it is beautiful. - Henri Poincaré (1908 [1914], Science and Method, English trans., p. 22).


About This Book

This textbook is for Singaporean H2 Maths students (hence the occasional Singlish and TLAs¹). Of course, I hope that anyone else in the world will also find this useful!

I needed a definitive reference for my own teaching needs, but could find nothing satisfying. So I decided to just write my own textbook.

This textbook is based exactly on the old (9740) and revised (9758) syllabuses. Do check to make sure which exam you’re taking. The revised syllabus (9758) is the same as the old syllabus (9740), but with noticeable chunks excised and is thus easier.²

| Year | 9740 (old) examined?    | 9758 (revised) examined? |
| 2016 | Yes.                    | No.                      |
| 2017 | Yes, for the last time. | Yes, for the first time. |
| 2018 | No.                     | Yes.                     |

SYLLABUS ALERT: Where there are any differences between the old and revised syllabuses, I’ll let you know with a yellow box like this.

• FREE! This book is free. But if you paid any money for it, I certainly hope your money is going to me! This book is free because:
1. It is a shameless advertising vehicle for my awesome tutoring services.
2. The marginal cost of reproducing this book is zero.

• DONATE! This book may be free, but donations are more than welcome! Donation methods in footnote.³ It’s irrational for Homo economicus to donate. But please consider donating because:
1. You’re a nice human being [*emotional_manipulation*].
2. Your donations will encourage me and others to continue producing awesome free content for the world.

¹ Three Letter Abbreviations.
² Indeed, some chunks of the old syllabus (9740) have simply been moved into the syllabus of Further Maths (9649), which the authorities have kindly resurrected for the 2017 exam season.
³ Singapore POSB Savings Account 174052271; OCBC Savings Account 5523016383 (Name: Choo Yan Min). Bitcoin wallet: 1GDGNAdGZhEq9pz2SaoAdLb1uu34LFwViz. Paypal [email protected] (Name: Yan Min Choo, USD preferred because this account was set up in the US). USA. Venmo link (Name: Yanmin Choo). Or buy a copy of my book on Amazon. Not exactly the best way to support me, because Amazon eats nearly half the price you pay. But hey, if you have an Amazon account set up and it’s more convenient for you than any of the preceding donation methods, go ahead and buy 100 copies.


• HELP ME IMPROVE THIS BOOK! Feel free to email me if:
1. There are any errors in this book. Please let me know even if it’s something as trivial as a spelling mistake or a grammatical error.
2. You have absolutely any suggestions for improvement.
3. Any part of this book is less than crystal clear.

Here’s an anecdote about Richard Feynman, the great teacher and physicist:

Feynman was once asked by a Caltech faculty member to explain why spin 1/2 particles obey Fermi-Dirac statistics. He gauged his audience perfectly and said, “I’ll prepare a freshman lecture on it.” But a few days later he returned and said, “You know, I couldn’t do it. I couldn’t reduce it to the freshman level. That means we really don’t understand it.”

I agree: If you can’t explain something simply, you don’t understand it well enough.⁴ Corollary: An excellent test of whether you understand something is to see if you can explain it simply to someone else.

If at any point in this textbook, you have read the same passage a few times, tried to reason it through, and still find things confusing, then it is a failure on MY part. Please let me know and I will try to rewrite it so that it’s clearer. (There is also the possibility that I simply messed up! So please let me know if there’s anything confusing!)

I deeply value any feedback, because I’d like to keep improving this textbook for the benefit of everyone! I am very grateful to all the kind folks who’ve already written in, allowing me to rid this book of more than a few embarrassing errors.

• LyX rocks! This book was written using LyX.⁵

• Is the font size big enough? You’re probably reading this on some device. So I’ve tried to set the font sizes and stuff so that one can comfortably read this on a device as small as a seven-inch tablet. It should also be possible to read this on a phone, though somewhat less comfortably. (Please let me know if you have any feedback about this!)

(I’ll probably be contacting some publishers to see if they want to do a print version of this, for anyone who prefers it in print.)

⁴ This quote or some similar variant is often (mis)attributed to Einstein. But as Einstein himself once said, “73% of Einstein quotes are misattributed.” More recently, another intellectual who said something close to this was Peter Singer: “whatever cannot be said clearly is probably not being thought clearly either.”
⁵ LaTeX is the typesetting program used by most economists and scientists. But LaTeX can be difficult to use. LyX is a user-friendly GUI version of LaTeX. LyX has boosted my productivity by countless hours over the years and you should use LyX too!


Tips for the Student

• Read maths slowly. Reading maths is not like reading Harry Potter. Most of Harry Potter is fluff. There is little fluff in maths. So go slowly. Dwell upon and carefully consider every sentence in this textbook. Make sure you completely understand what each statement says and why it is true. Reading maths is very different from reading any other subject matter.

If you don’t quite understand some material, you might be tempted to move forward anyway. Don’t. In maths, later material usually builds on earlier material. So if you simply move forward, this will usually cost you more time and frustration in the long run. Better then to stop right there. Keep working on it until you “get” it. Ask a friend or a teacher for help. Feel free to even email me! (I’m always interested to know what the common points of confusion are and how I can better clear them up.)

• Examples and exercises are your best friends. So work through them.

A good stock of examples, as large as possible, is indispensable for a thorough understanding of any concept, and when I want to learn something new, I make it my first job to build one. - Paul Halmos (1983, Google Books).

Work through all the examples and exercises. Merely moving your eyeballs is not the same as working. Working means having pencil and paper by your side and going through each example/exercise word-by-word, line-by-line.

For example, I might say something like “$x^2 - y^2 = 0$. Thus, $(x - y)(x + y) = 0$.” If it’s not obvious to you why the first sentence implies the second, stop right there and work on it until you understand why. Don’t just let your eyeballs fly over these sentences and pretend that your brain is “getting” it. I will often not bother to explain some steps, especially if they simply involve some simple algebra.

• You get a List of Formulae during the A-level exam. So there’s no need to memorise all the formulae that are already on the list you’re getting.

Note that you get a different list depending on which exam you’re taking — List of Formulae (MF15) for the old 9740 exam and List of Formulae (MF26) for the revised 9758 exam. I cannot guarantee though that your JC will give you the List during your JC common tests and exams.
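For readers who want to check the factorisation quoted in the tip on working through examples, one way to see it is to expand the product directly. This is only a sketch of the intermediate algebra (it assumes the `amsmath` package for `align*`), and it is still worth attempting on paper first:

```latex
\begin{align*}
(x - y)(x + y) &= x^2 + xy - yx - y^2 \\
               &= x^2 - y^2 .
\end{align*}
```

Since the two sides are equal for all $x$ and $y$, the statement $x^2 - y^2 = 0$ says exactly the same thing as $(x - y)(x + y) = 0$.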


• Remember your O-Level Maths & ‘A’ Maths? You’ve probably forgotten some (or most?) of it, but unfortunately, you are still assumed to know EVERYTHING from O-Level Maths & ‘A’ Maths. (To take H2 Maths, most JCs require that you at least passed ‘A’ Maths.⁶) See in particular the lists near the end of either the 9740 (old) or the 9758 (revised) syllabus. Skim through and see if anything looks totally alien to you! Some chapters (e.g. Chapters 5 and 26) in this textbook will give a quick review of some of the O-Level Maths material that you may have forgotten but which we’ll use quite often.

• Online Calculators

Google is probably the quickest for simple calculations. Type anything into your browser’s Google search bar and the answer will instantly show up.

Wolfram Alpha is somewhat more advanced (but also slower). Enter “sin x” for example and you’ll get graphs, the derivative, the indefinite integral, the Maclaurin series, and a bunch of other stuff you neither know nor care about. The Derivative Calculator and the Integral Calculator are probably unbeatable for the specific purposes of differentiation and integration. Both give step-by-step solutions for anything you want to differentiate or integrate. Here is a collection of spreadsheets I made. These spreadsheets are for doing tedious and repetitive calculations you’ll often encounter in H2 maths (e.g. with vectors, complex numbers, etc.). As with anything I do, I welcome any feedback you may have about these spreadsheets. Perhaps in the future I will make a more attractive version of it. (Instructions: Click “Make a copy” to open up your own independent copy of this spreadsheet. Enter your input in the yellow cells. Output is produced in the blue cells. If you mess up anything, simply click the same link and “Make a copy” again.)

⁶ Some JCs, like HCI, even require that you got at least a B3 for both Maths & ‘A’ Maths.


• Other Online Resources

There are way too many websites out there catering to primary, secondary, and lower-level undergraduate maths. Unfortunately, some of them can be awful and can get things wrong. Three resources I like (though they are probably a bit advanced for JC students) are:

1. Math StackExchange is a great resource where you can ask maths questions and often get them answered fairly promptly. Note though that this site is mostly frequented by fairly advanced students of maths (not to mention also mathematicians), so they can be pretty impatient and quick to downvote questions they perceive to be “stupid”. Nonetheless, if you make an effort to write down a carefully-crafted question and show also that you’ve made some effort to look for an answer (either on your own or online), they can be very helpful.⁷
2. ProofWiki gives succinct and rigorous definitions and proofs. Unfortunately it is very incomplete.
3. Mathworld.Wolfram is also great, but at times excessively encyclopaedic, at the cost of clarity and brevity.

And of course, you can find countless free maths textbooks online (some less legal than others). Two totally illegal⁸ resources are: LibGen for books and SciHub for articles.⁹ An old reliable is BitTorrent.

⁷ There is an entire StackExchange family of websites. The flagship site is StackOverflow where you can ask any programming question and get it answered amazingly quickly.
⁸ Well, depending on which jurisdiction you live in. Of course, in Singapore, unless told otherwise, you should assume that everything is illegal.
⁹ Note though that these sites are constantly playing whac-a-mole with the fascist authorities so the URLs often change — if so, simply google to look up the current URLs.


Use of Graphing Calculators

You are required to know how to use a graphing calculator.¹⁰ This textbook will give only a very few examples involving graphing calculators. There is no better way of learning to use it than to play around with it yourself. By the time you sit down for your A-level exams, you should have had plenty of practice with it.

You can also use any of the seven calculators in the list below (last updated by SEAB on March 1st, 2016, PDF). But this textbook will stick with the TI-84 PLUS Silver Edition (which I’ll simply call the TI84). (My understanding is that most students use a TI calculator and that the five approved TI calculators are pretty similar.) I’ll always start each example with the calculator freshly reset.

¹⁰ Pretty bizarre that in this age of the smartphone, they want you to learn how to use these clunky and now-useless devices from the ’80s and ’90s. It is the equivalent of learning to program a VCR. IMHO it’d be much better to teach you some simple programming or Excel (or whatever spreadsheet program). “B-bbut ... how would such learning be tested in an exam format?” Ay, there’s the rub. In the Singapore education system, anything that cannot be “examified” is not worth learning.


Preface

(This preface is largely an unprofessional, blog-like, meandering rant. It is a means of releasing some frustrations pent up during the 5 months I spent looking at the A-level syllabus and writing this textbook. It is in large part a critique of the Singapore education system. I hope it’s entertaining and, at the same time, taken seriously.)

Divide students into two extremes:
1. Type #1 is happy to get an A, even if this means learning absolutely nothing.
2. Type #2 would rather learn a lot, even if this means getting a C.

The good Singaporean is taught that pragmatism is the highest virtue (and obedience second). She is thus also trained to be a Type #1 student (and indeed a Type #1 human being).

If you’re a Type #1 student, then this textbook may not be the best use of your time (though you may still find the TYS and answers useful). Please use instead these resources, which are provided with the efficient Type #1 student in mind:
• The H2 Mathematics CheatSheet, which contains all the formulae you’ll ever need on two sides of a single A4 sheet of paper.¹¹
• My H1 Mathematics Textbook, which is written simply, and which covers a subset of the H2 syllabus.
• The H2 Maths Exercise Book (coming soon), which teaches you how to mindlessly apply formulae and give the “correct” answer to every exam question.
• My totally awesome tuition classes!

Of course, it is fully intended that this textbook (complemented by a capable teacher) will help any student get her A. But, as I now explain, getting an A is quite beside the point of this textbook.

Even a Type #1 student may find this textbook pragmatic, provided she plans to go on and do more maths (this includes physics, economics, engineering). Let me explain.

Over the years, the gahmen has made half-hearted attempts to magically transform test-taking drones into creative innovators (usually involving silly four-letter campaigns like TSLN 1997, TLLM 2005). Nonetheless, school administrators, teachers, and students alike remain completely fixated with exams. And who can blame them, given the way the game is set up?

¹¹ Two things: (1) This CheatSheet does not include many of the formulae already printed in List MF26. (2) It is written for 9758 (revised) students (so 9740 students may find a few things missing).


There is a place for testing. The problem is that as currently constructed, the exams do not test for genuine understanding. Instead, they merely test for whether you’ve mastered the art of mimicry and how to follow instructions, recipes, and algorithms. In other words, they test whether you are an obedient monkey capable of performing tricks you’ve practised over and over and over again.

I have a study in mind: Gather all the students who got As for their A-level maths exams, 5, 10, 15, 20 years after the exam. Ask them to do the exact same A-level exam that they took so many years ago. (I suspect most will get an F, if not a 0.) Ask them if they remember anything of their JC maths education. (I suspect most of them will remember absolutely nothing.) Ask them if their JC maths education had any value whatsoever. (I suspect most of them will say no.)

If my suspicions are correct, then JC maths education is completely worthless. (Except of course as a selection device, which in elitist Singapore is of paramount importance. Grades help to sort the President’s Scholar from the mere PSC scholar; and the lowly McDonald’s employee from the dalit garbage man.)

If you intend to do more maths in the future, then merely doing well in the A-level maths exams may give you the illusion that you’ve mastered the material. Down the road, this will cost you more time, in terms of being confused and having difficulty grasping more advanced material. Better then to spend a little bit more time to actually understand the material while you’re learning it right now.

Personal anecdote: When I began my undergraduate studies (in the US), I was still a typical kiasu Singaporean who believed life to be a competitive race. And so I skipped a whole bunch of first- and second-year maths classes (Calculus I, Calculus II, Statistics, and Linear Algebra), thinking I had already covered all the material back in JC. On paper, I may indeed have covered all this material. But in practice, all I’d learnt in JC was monkey see, monkey do. I didn’t actually understand anything. It was only many years later, with the benefit of hindsight, that I began to see how much of a mistake I had made. In the long run, skipping those classes — which was meant to save me time and put me ahead of the race — actually cost me more lost time. I would have saved more time if I had simply taken those skipped classes.

The goal of this textbook is to impart genuine understanding — or at least as much as is possible, within the stultifying confines of the A-level syllabus.¹² Such genuine understanding has intrinsic value. But it will also save you time in the long run, if you go on to use more maths in the future.

In A Mathematician’s Lament, Paul Lockhart describes (pre-tertiary) maths education in the US as being “stupid and boring”, “formulaic”, and “mindless” “pseudo-maths”. The same may be said of maths education in Singapore. But at least the average US student has the consolation that only a very small portion of her life will have been squandered on such “mindless” “pseudo-maths”.

¹² One of my quixotic long-term ambitions is to change the Singapore A-level maths syllabuses. For now, this textbook shall be my contribution.


The same cannot be said for the average student in Singapore (or other similar East Asian societies). By the time she turns 18, she will have — just for the subject of maths alone — clocked many thousands of hours attending classes; doing homework; doing practice exam questions; doing assessment books, Ten Year Series; going to tuition classes; taking common tests, promos, prelims, one big exam after another; etc.

In the US, “Singapore Math” has acquired something of a mythical status.¹³ As with weight loss, Americans are constantly on the lookout for some magic, painless solution to their mediocre (pre-tertiary) education systems. But for me there is no mystery — Singaporean kids are simply forced to work their butts off. While American teenagers are “wasting” their time on typical, “useless” teenager-ly pursuits, our teenagers are seated obediently in front of their desks, doing yet another soul-crushing TYS question. Singapore produces world champion exam-takers, but will never produce a Fields Medallist or a Nobel Laureate (at least in the foreseeable future).¹⁴

This textbook is for the Singaporean A-level student. So a good deal of “mindless formulae” is unavoidable. But at the same time, I try in this textbook to give the student a tiny glimpse of what maths really is — “the art of explanation”.¹⁵ I try to plant a thoughtcrime in the student’s mind: Maths is not merely another pain to be endured, but can at times be a joy. And so for example, this textbook explains
• A bit of intuition behind differentiation, integration, and the Fundamental Theorems of Calculus. (To get an A, no understanding of these is necessary. Instead, one need merely know how to “do” differentiation and integration problems.)
• Why the Central Limit Theorem is so amazing. (To get an A, one need merely treat the CLT as yet another mysterious mathematical trick that helps solve exam questions. No appreciation of why it is wonderful is necessary.)
• A bit of intuition behind the Maclaurin series. (To get an A, it suffices to know how to mindlessly apply this strange formula that falls out from the sky.)
• Why it is terribly wrong to believe that “a high correlation coefficient means a good model”. (Yet this is exactly what the writers of the A-level exams seem to believe. See Section 73.9.)

Another personal anecdote: While in JC, I remember being deeply mystified by why the scalar (or dot) product, despite having a simple algebraic definition, could at the same time also tell us about the cosine of the angle between the two vectors. I never figured it out,¹⁶ but it didn’t matter, because this was simply “yet another formula” that we were required to know, for the sole purpose of answering exam questions.

¹³ In the 2012 PISA, the top countries were, in descending order, Singapore, Hong Kong, Taiwan, Korea, Macao, Japan, Liechtenstein, Netherlands, Estonia, Finland. Source: PISA 2012 Results in Focus, p. 10.
¹⁴ I am currently accepting bets for this proposition: “By 2050, no born-and-bred Singaporean will have won a Fields medal or a Nobel Prize (Peace Prize excluded).” We’ll need to work out what exactly “born-and-bred” means, but that can be ironed out.
¹⁵ Lockhart, p. 29.
¹⁶ I remember complaining about this to a classmate and his response was “But that’s how we’ve always been taught maths what. It’s just a bunch of formula.” He was probably right. Today, of course, the intellectually curious student can easily find the answer on the internet. But at that time, the internet was not quite so well-developed, so one could not easily find answers online.
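One standard way to resolve the dot-product puzzle in the anecdote above is through the law of cosines. The following is only a sketch, and not necessarily the argument this textbook gives later. The symbols are notation assumed here for illustration: $\mathbf{a}$ and $\mathbf{b}$ are two non-zero vectors, $\theta$ is the angle between them, and the triangle considered has sides $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{a}-\mathbf{b}$ (the `amsmath` package is assumed for `align*`).

```latex
\begin{align*}
|\mathbf{a}-\mathbf{b}|^2
  &= (\mathbf{a}-\mathbf{b})\cdot(\mathbf{a}-\mathbf{b})
   = |\mathbf{a}|^2 + |\mathbf{b}|^2 - 2\,\mathbf{a}\cdot\mathbf{b}
  && \text{(expanding the algebraic definition)} \\
|\mathbf{a}-\mathbf{b}|^2
  &= |\mathbf{a}|^2 + |\mathbf{b}|^2 - 2\,|\mathbf{a}|\,|\mathbf{b}|\cos\theta
  && \text{(law of cosines)} \\
\Longrightarrow\quad \mathbf{a}\cdot\mathbf{b}
  &= |\mathbf{a}|\,|\mathbf{b}|\cos\theta .
\end{align*}
```

Comparing the two expressions for $|\mathbf{a}-\mathbf{b}|^2$ term by term is what links the algebraic definition to the angle.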


I remember being confused about the difference between the sample mean, the mean of the sample mean, the variance of the sample mean, and the sample variance. But this confusion didn’t matter, because once again, all we needed to do to get an A was to mindlessly apply formulae and algorithms. Monkey see, monkey do. This textbook is thus partly in response to my unhappy and unsatisfactory experience as a maths student in Singapore. Almost all results are proven. I often try to supply the intuition for each result in the simplest possible terms. Many proofs are relegated to the appendices, but where a proof is especially simple and beautiful, I encourage the student to savour it by leaving it in the main text. In the rare instances where proofs are entirely omitted from this book — usually because they are too advanced — I make sure to clearly state so, lest the student wonder whether the result is supposed to be obvious. Finally, I also hope that this textbook will serve as an authoritative resource to which teachers and students alike can refer.
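The four quantities named at the start of the passage above (the sample mean, the mean of the sample mean, the variance of the sample mean, and the sample variance) can be told apart with a small simulation. The following Python sketch is an illustration only: the function name `simulate`, the normal population, and all parameter values are assumptions made here, not anything taken from this textbook.

```python
import random
import statistics


def simulate(pop_mean=10.0, pop_sd=2.0, n=25, trials=2000, seed=0):
    """Draw many samples of size n and compare the four easily-confused quantities."""
    rng = random.Random(seed)
    sample_means = []      # one sample mean per sample
    sample_variances = []  # one sample variance per sample
    for _ in range(trials):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        sample_means.append(statistics.fmean(sample))
        # statistics.variance divides by n - 1 (the unbiased sample variance)
        sample_variances.append(statistics.variance(sample))
    return {
        # averaging the sample means estimates the population mean
        "mean_of_sample_means": statistics.fmean(sample_means),
        # the spread of the sample means estimates pop_sd**2 / n
        "variance_of_sample_means": statistics.variance(sample_means),
        # averaging the sample variances estimates pop_sd**2
        "average_sample_variance": statistics.fmean(sample_variances),
        "theoretical_variance_of_sample_mean": pop_sd ** 2 / n,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.4f}")
```

The point of the design is that each sample yields one sample mean and one sample variance, whereas the mean and variance of the sample mean describe how that single number varies from sample to sample.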

This textbook is far from perfect. To quote the motto of a certain neighbourhood secondary school, the best is yet to be. I hope that with your help, this textbook will be continuously improved. So if you have any feedback or spot any errors, please feel free to email me. As you can tell, I am pretty merciless about criticising others. So please don’t be shy about pointing out to me the many mistakes that are surely still lurking in this textbook.


Contents About This Book

7

Tips for the Student

9

Use of Graphing Calculators

12

Preface

13

I

37

Functions and Graphs

1 Sets

38

1.2

In ∈ and Not In ∉ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

40

1.3

Greater than >, Less Than <, Positive > 0, and Negative < 0 . . . . . . . . . . Types of Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

41

1.4

The Order of the Elements Doesn’t Matter . . . . . . . . . . . . . . . . . . . .

42

1.5

Repeated Elements Don’t Count . . . . . . . . . . . . . . . . . . . . . . . . . .

43

1.6

Ellipsis . . . Means Continue in the Obvious Fashion . . . . . . . . . . . . . . .

44

1.7

Sets can be Finite or Infinite . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

45

1.8

Special Names of Sets: Z, Q, R, and C . . . . . . . . . . . . . . . . . . . . . . .

46

1.9

Special Names of Sets: Intervals

. . . . . . . . . . . . . . . . . . . . . . . . . .

47

1.10 Special Names of Sets: The Empty Set ∅ . . . . . . . . . . . . . . . . . . . . .

48

1.1

1.11 Subset Of ⊆ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.12 Proper Subset Of ⊂ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.13 Union ∪ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.14 Intersection ∩ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

39

49 50 51 52

1.15 Set Minus / . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

54

1.17 Set-Builder Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

55

1.16 Set Complement A′ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Page 17, Table of Contents

53

www.EconsPhDTutor.com

2 Dividing By Zero

56

3 Functions

57

3.1 3.2

Formal Mathematical Notation for Functions

. . . . . . . . . . . . . . . . . .

EVERY x ∈ D Must be Mapped to EXACTLY ONE y ∈ C . . . . . . . . . . .

59 63

3.3

Real-Valued Functions of a Real Variable . . . . . . . . . . . . . . . . . . . . .

65

3.4

The Range of a Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

66

3.5

Creating New Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

67

3.6

One-to-One Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

68

3.7

Inverse Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

69

3.8

Domain Restriction to Create an Invertible Function . . . . . . . . . . . . . .

74

3.9

Composite Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

75

4 Graphs 4.1

78

Graphing with Your TI84 Graphing Calculator . . . . . . . . . . . . . . . . . .

5 Quick Revision: Exponents, Surds, Absolute Value

84 86

5.1

Laws of Exponents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

86

5.2

Rationalising the Denominator of a Surd . . . . . . . . . . . . . . . . . . . . .

87

5.3

Absolute Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

88

6 Intercepts

90

7 Symmetry

92

7.1

Reflection of a Point in a Line . . . . . . . . . . . . . . . . . . . . . . . . . . . .

92

7.2

Reflection of a Graph in a Line . . . . . . . . . . . . . . . . . . . . . . . . . . .

93

7.3

Lines of Symmetry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

96

8 Limits, Continuity, and Asymptotes

98

8.1

Limits: Introduction and Examples . . . . . . . . . . . . . . . . . . . . . . . . .

8.2

Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

8.3

Limits: More Examples

8.4

Infinite Limits and Vertical Asymptotes . . . . . . . . . . . . . . . . . . . . . . 107

8.5

Limits at Infinity, Horizontal and Oblique Asymptotes . . . . . . . . . . . . . 109

Page 18, Table of Contents

98

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

www.EconsPhDTutor.com

9 Differentiation

112

9.1

Motivation: The Derivative as Slope of the Tangent . . . . . . . . . . . . . . . 112

9.2

Lagrange’s, Leibniz’s, and Newton’s Notation . . . . . . . . . . . . . . . . . . . 115

9.3

The Derivative is a Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

9.4

Second and Higher-Order Derivatives

. . . . . . . . . . . . . . . . . . . . . . . 119

9.5

More About Leibniz’s Notation: The

d Operator . . . . . . . . . . . . . . . . 121 dx

9.6

Standard Rules of Differentiation . . . . . . . . . . . . . . . . . . . . . . . . . . 122

9.7

Differentiable and Twice-Differentiable Functions . . . . . . . . . . . . . . . . 125

9.8

Differentiability Implies (i.e. is Stronger Than) Continuity . . . . . . . . . . . 128

9.9

Implicit Differentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

10 Increasing, Decreasing, and $f'$

131

10.1 When a Function is Increasing or Decreasing . . . . . 131
10.2 The First Derivative Increasing/Decreasing Test . . . . . 132

11 Extreme, Stationary, and Turning Points

133

11.1 Maximum and Minimum Points . . . . . 133
11.2 Global Maximum and Minimum Points . . . . . 136
11.3 Stationary and Turning Points . . . . . 140
11.4 The Interior Extremum Theorem . . . . . 142
11.5 How to Find Maximum and Minimum Points . . . . . 144

12 Concavity, Inflexion Points, and the 2DT

147

12.1 The Second Derivative Test (2DT) . . . . . 151
12.2 Summary of Points and Venn Diagram . . . . . 153

13 Relating the Graph of $f'$ to that of $f$

155

14 Quick Revision: Quadratic Equations $y = ax^2 + bx + c$

158


15 Transformations

162

15.1 $y = f(x) + a$ . . . . . 162
15.2 $y = f(x + a)$ . . . . . 163
15.3 $y = af(x)$ . . . . . 164
15.4 $y = f(ax)$ . . . . . 165
15.5 Combinations of the Above . . . . . 166
15.6 $y = |f(x)|$ . . . . . 167
15.7 $y = f(|x|)$ . . . . . 168
15.8 $y = \frac{1}{f(x)}$ . . . . . 169
15.9 $y^2 = f(x)$ . . . . . 170

16 Conic Sections

172

16.1 The Ellipse $x^2 + y^2 = 1$ (The Unit Circle) . . . . . 175
16.2 The Ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ . . . . . 177
16.3 The Hyperbola: $y = 1/x$ . . . . . 179
16.4 The Hyperbola $x^2 - y^2 = 1$ . . . . . 181
16.5 The Hyperbola $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$ . . . . . 182
16.6 The Hyperbola $\frac{y^2}{b^2} - \frac{x^2}{a^2} = 1$ . . . . . 184
16.7 Long Division of Polynomials . . . . . 186
16.8 The Hyperbola $y = \frac{bx + c}{dx + e}$ . . . . . 189
16.9 The Hyperbola $y = \frac{ax^2 + bx + c}{dx + e}$ . . . . . 194

17 Simple Parametric Equations

199

17.1 Eliminating the Parameter t . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204


18 Equations and Inequalities

207

18.1 $\frac{ax + b}{cx + d} > 0$ . . . . . 208
18.2 $\frac{ax^2 + bx + c}{dx^2 + ex + f} > 0$ . . . . . 210
18.3 Solving Inequalities by Graphical Methods . . . . . 214
18.4 Systems of Equations . . . . . 217

II

Sequences and Series

220

19 Finite Sequences

221

19.1 A Corresponding Function for a Sequence . . . . . 222
19.2 Recurrence Relations . . . . . 223
19.3 Creating New Sequences . . . . . 225

20 Infinite Sequences

227

20.1 Creating New Sequences . . . . . 228

21 Series

229

21.1 Convergent and Divergent Sequences and Series . . . . . 230

22 Summation Notation Σ

232

23 Arithmetic Sequences and Series

235

23.1 Finite Arithmetic Sequences and Series . . . . . 236

24 Geometric Sequences and Series

237

24.1 Finite Geometric Sequences and Series . . . . . 238
24.2 Infinite Geometric Sequences and Series . . . . . 239

25 Proof by the Method of Mathematical Induction

240

III

Vectors

246


26 Quick Revision of Some O-Level Maths

247

26.1 Lines vs. Line Segments vs. Rays . . . . . 247
26.2 Angles are Measured in Radians . . . . . 248
26.3 Angles - Acute, Right, Obtuse, Straight, Reflex . . . . . 249
26.4 Triangles - Acute, Right, Obtuse . . . . . 250
26.5 Defining the Trigonometric Functions . . . . . 251
26.6 Formulae for Sine, Cosine, and Tangent . . . . . 254
26.7 Arcsine, Arccosine, Arctangent . . . . . 255
26.8 The Law of Sines and the Law of Cosines . . . . . 259

27 Vectors in Two Dimensions (2D)

261

27.1 Sum and Difference of Points and Vectors . . . . . 266
27.2 Sum, Additive Inverse, and Difference of Vectors . . . . . 269
27.3 Displacement Vectors . . . . . 272
27.4 Length (or Magnitude) of a Vector . . . . . 273
27.5 Scalar Multiplication of a Vector . . . . . 275
27.6 Unit Vectors . . . . . 276
27.7 The Ratio Theorem . . . . . 279

28 Scalar Product

281

28.1 The Angle between Two Vectors . . . . . 282
28.2 Projection of One Vector on Another . . . . . 286
28.3 Direction Cosines . . . . . 288

29 Vectors in 3D

290

30 Vector Product

294

30.1 Vector Product in 2D . . . . . 294
30.2 Areas of Triangles and Parallelograms . . . . . 296
30.3 Vector Product in 3D . . . . . 298

31 Lines

302

31.1 Lines on a 2D Plane: Cartesian to Vector Equations . . . . . 302
31.2 Lines on a 2D Plane: Vector to Cartesian Equations . . . . . 307
31.3 Lines in 3D Space: Vector Equations . . . . . 309
31.4 Lines in 3D Space: Vector to and from Cartesian Equations . . . . . 311
31.5 Collinearity . . . . . 318

32 Planes

320

32.1 Planes: Vector to Cartesian Equations . . . . . 327
32.2 Planes: Hessian Normal Form . . . . . 329

33 Distances

330

33.1 Distance of a Point from a Line . . . . . 331
33.2 Distance of a Point from a Plane . . . . . 338

34 Angles

342

34.1 Angle between Two Lines (2D) . . . . . 342
34.2 Angle between Two Lines (3D) . . . . . 347
34.3 Angle between A Line and a Plane . . . . . 350
34.4 Angle between Two Planes . . . . . 352

35 Relationships between Lines and Planes

354

35.1 Relationship between Two Lines . . . . . 354
35.2 Relationship between a Line and a Plane . . . . . 358
35.3 Relationship between Two Planes . . . . . 360
35.4 Relationship between Three Planes . . . . . 365

IV

Complex Numbers

371

36 Complex Numbers: Introduction

372

36.1 The Real and Imaginary Parts of Complex Numbers . . . . . 375

37 Basic Arithmetic of Complex Numbers

376

37.1 Addition and Subtraction . . . . . 376
37.2 Multiplication . . . . . 377
37.3 Division . . . . . 379

38 Solving Polynomial Equations

381

38.1 Complex Roots to Quadratic Equations . . . . . 381
38.2 The Fundamental Theorem of Algebra . . . . . 383
38.3 The Complex Conjugate Roots Theorem . . . . . 386

39 The Argand Diagram

387

39.1 Complex Numbers in Polar Form . . . . . 389
39.2 Complex Numbers in Exponential Form . . . . . 394

40 More Arithmetic of Complex Numbers

395

40.1 The Product of Two Complex Numbers . . . . . 395
40.2 The Ratio of Two Complex Numbers . . . . . 399
40.3 Sine and Cosine as Weighted Sums of the Exponential . . . . . 401

41 Geometry of Complex Numbers

404

41.1 The Sum and Difference of Two Complex Numbers . . . . . 404
41.2 The Product and Ratio of Two Complex Numbers . . . . . 406
41.3 Conjugating a Complex Number . . . . . 408

42 Loci Involving Cartesian Equations

409

42.1 Circles . . . . . 409
42.2 Lines . . . . . 414
42.3 Intersection of Lines and Circles . . . . . 418

43 Loci Involving Complex Equations

421

43.1 Circles . . . . . 421
43.2 Lines . . . . . 423
43.3 Rays . . . . . 424
43.4 Quick O-Level Revision: Properties of The Circle . . . . . 427

44 De Moivre’s Theorem

430

44.1 Powers of a Complex Number . . . . . 431
44.2 Roots of a Complex Number . . . . . 434

V

Calculus

439

45 Solving Problems Involving Differentiation

440

45.1 Inverse Function Theorem (IFT) . . . . . 440
45.2 Differentiation of Simple Parametric Functions . . . . . 441
45.3 Equations of Tangents and Normals . . . . . 442
45.4 Connected Rates of Change Problems . . . . . 443
45.5 Finding Max/Min Points on the TI84 . . . . . 445
45.6 Finding the Derivative at a Point on the TI84 . . . . . 447

46 The Maclaurin Series

449

46.1 Power Series . . . . . 449
46.2 Maclaurin Series . . . . . 450
46.3 The Amazing Maclaurin Series . . . . . 452
46.4 Finite-Order Maclaurin Series as Approximations . . . . . 454
46.5 Product of Two Power Series . . . . . 458
46.6 Composition of Two Functions . . . . . 460
46.7 How the Maclaurin Series Works (Optional) . . . . . 463

47 The Indefinite Integral

464

47.1 The Constant of Integration . . . . . 466
47.2 The Indefinite Integral is Unique Up to the C.O.I. . . . . . 467

48 Integration Techniques

468

48.1 Basic Rules of Integration . . . . . 469
48.2 More Basic Rules of Integration . . . . . 470
48.3 Trigonometric Functions . . . . . 472
48.4 Integration by Substitution (IBS) . . . . . 473
48.5 Integration by Parts (IBP) . . . . . 478

49 The Fundamental Theorems of Calculus (FTCs)

479

49.1 The Area Function . . . . . 479
49.2 The First Fundamental Theorem of Calculus (FTC1) . . . . . 484
49.3 The Definite (or Riemann) Integral . . . . . 488
49.4 The Second Fundamental Theorem of Calculus (FTC2) . . . . . 489

50 Definite Integrals

490

50.1 Area between a Curve and Lines Parallel to Axes . . . . . 491
50.2 Area between a Curve and a Line . . . . . 492
50.3 Area between Two Curves . . . . . 493
50.4 Area below the x-Axis . . . . . 494
50.5 Area under a Parametrically-Defined Curve . . . . . 495
50.6 Volume of Rotation about the y- or x-Axis . . . . . 496
50.7 Finding Definite Integrals on your TI84 . . . . . 499

51 Differential Equations

500

51.1 $\frac{dy}{dx} = f(x)$ . . . . . 500
51.2 $\frac{dy}{dx} = f(y)$ . . . . . 501
51.3 $\frac{d^2y}{dx^2} = f(x)$ . . . . . 503
51.4 Word Problems . . . . . 504
51.5 Family of Solution Curves to Represent the General Solution . . . . . 507

VI

Probability and Statistics

508

52 How to Count: Four Principles

509

52.1 How to Count: The Addition Principle . . . . . 510
52.2 How to Count: The Multiplication Principle . . . . . 513
52.3 How to Count: The Inclusion-Exclusion Principle . . . . . 516
52.4 How to Count: The Complements Principle . . . . . 518

53 How to Count: Permutations

519

53.1 Permutations with Repeated Elements . . . . . 522
53.2 Circular Permutations . . . . . 527
53.3 Partial Permutations . . . . . 530
53.4 Permutations with Restrictions . . . . . 531

54 How to Count: Combinations

533

54.1 Pascal’s Triangle . . . . . 536
54.2 The Combination as Binomial Coefficient . . . . . 537
54.3 The Number of Subsets of a Set is $2^n$ . . . . . 539

55 Probability: Introduction

541

55.1 Mathematical Modelling . . . . . 541
55.2 The Experiment as a Model of Scenarios Involving Chance . . . . . 543
55.3 The Kolmogorov Axioms . . . . . 548
55.4 Implications of the Kolmogorov Axioms . . . . . 549

56 Probability: Conditional Probability

551

56.1 The Conditional Probability Fallacy (CPF) . . . . . 553
56.2 Two-Boys Problem (Fun, Optional) . . . . . 557

57 Probability: Independence

559

57.1 Warning: Not Everything is Independent . . . . . 564
57.2 Probability: Independence of Multiple Events . . . . . 566

58 Fun Probability Puzzles

567

58.1 The Monty Hall Problem . . . . . 567
58.2 The Birthday Problem . . . . . 570

59 Random Variables: Introduction

571

59.1 A Random Variable vs. Its Observed Values . . . . . 572
59.2 $X = k$ Denotes the Event $\{s \in S : X(s) = k\}$ . . . . . 573
59.3 The Probability Distribution of a Random Variable . . . . . 574
59.4 Random Variables Are Simply Functions . . . . . 577

60 Random Variables: Independence

579

61 Random Variables: Expectation

582

61.1 The Expected Value of a Constant R.V. is Constant . . . . . 585
61.2 The Expectation Operator is Linear . . . . . 587

62 Random Variables: Variance

589

62.1 The Variance of a Constant R.V. is 0 . . . . . 595
62.2 Standard Deviation . . . . . 596
62.3 The Variance Operator is Not Linear . . . . . 597
62.4 The Definition of the Variance (Optional) . . . . . 599

63 The Coin-Flips Problem (Fun, Optional)

600

64 The Bernoulli Trial and the Bernoulli Distribution

601

64.1 Mean and Variance of the Bernoulli Random Variable . . . . . 603

65 The Binomial Distribution

604

65.1 Probability Distribution of the Binomial R.V. . . . . . 606
65.2 The Mean and Variance of the Binomial Random Variable . . . . . 607

66 The Poisson Distribution

609

66.1 Formal Definition of the Poisson Random Variable . . . . . 611
66.2 When is the Poisson Random Variable an Appropriate Model? . . . . . 612
66.3 The Mean and Variance of the Poisson Random Variable . . . . . 615
66.4 The Poisson Distribution as an Approximation of the Binomial Distribution . . . . . 616
66.5 The Sum of Two Independent Poisson R.V.’s is a Poisson R.V. . . . . . 619

67 The Continuous Uniform Distribution

622

67.1 The Continuous Uniform Distribution . . . . . 623
67.2 The Cumulative Distribution Function (CDF) . . . . . 625
67.3 Important Digression: $P(X \le k) = P(X < k)$ . . . . . 626
67.4 The Probability Density Function (PDF) . . . . . 627

68 The Normal Distribution

628

68.1 The Normal Distribution, in General . . . . . 634
68.2 Sum of Independent Normal Random Variables . . . . . 643
68.3 The Central Limit Theorem and The Normal Approximation . . . . . 647
68.3.1 Normal Approximation to the Binomial Distribution . . . . . 650
68.3.2 Normal Approximation to the Poisson Distribution . . . . . 651

69 The CLT is Amazing (Optional)

652

69.1 The Normal Distribution in Nature . . . . . 652
69.2 Illustrating the Central Limit Theorem (CLT) . . . . . 656
69.3 Why Are So Many Things Normally Distributed? . . . . . 662
69.4 Don’t Assume That Everything is Normal . . . . . 663

70 Statistics: Introduction (Optional)

669

70.1 Probability vs. Statistics . . . . . 669
70.2 Objectivists vs Subjectivists . . . . . 670

71 Sampling

672

71.1 Population . . . . . 672
71.2 Population Mean and Population Variance . . . . . 673
71.3 Parameter . . . . . 674
71.4 Distribution of a Population . . . . . 675
71.5 A Random Sample . . . . . 676
71.6 Sample Mean and Sample Variance . . . . . 678
71.7 Sample Mean and Sample Variance are Unbiased Estimators . . . . . 684
71.8 The Sample Mean is a Random Variable . . . . . 687
71.9 The Distribution of the Sample Mean . . . . . 688
71.10 Non-Random Samples . . . . . 689
71.11 Stratified, Quota, and Systematic Sampling . . . . . 690

72 Null Hypothesis Significance Testing (NHST)

696

72.1 One-Tailed vs Two-Tailed Tests . . . . . 700
72.2 The Abuse of NHST (Optional) . . . . . 703
72.3 Common Misinterpretations of the Margin of Error (Optional) . . . . . 704
72.4 Critical Region and Critical Value . . . . . 707
72.5 Testing of a Population Mean (Small Sample, Normal Distribution, $\sigma^2$ Known) . . . . . 709
72.6 Testing of a Population Mean (Large Sample, Any Distribution, $\sigma^2$ Known) . . . . . 711
72.7 Testing of a Population Mean (Large Sample, Any Distribution, $\sigma^2$ Unknown) . . . . . 713
72.8 Testing of a Population Mean (Small Sample, Normal Distribution, $\sigma^2$ Unknown) . . . . . 715
72.9 Formulation of Hypotheses . . . . . 719

73 Correlation and Linear Regression

720

73.1 Bivariate Data and Scatter Diagrams . . . . . 720
73.2 Product Moment Correlation Coefficient (PMCC) . . . . . 722
73.3 Correlation Does Not Imply Causation (Optional) . . . . . 727
73.4 Linear Regression . . . . . 728
73.5 Ordinary Least Squares (OLS) . . . . . 730
73.6 TI84 to Calculate the PMCC and the OLS Estimates . . . . . 735
73.7 Interpolation and Extrapolation . . . . . 737
73.8 Transformations to Achieve Linearity . . . . . 745
73.9 The Higher the PMCC, the Better the Model? . . . . . 749

VII

Ten-Year Series

751

74 Past-Year Questions for Part I: Functions and Graphs

752

75 Past-Year Questions for Part II: Sequences and Series

763

76 Past-Year Questions for Part III: Vectors

773

77 Past-Year Questions for Part IV: Complex Numbers

782

78 Past-Year Questions for Part V: Calculus

788

79 Past-Year Questions for Part VI: Prob. and Stats.

816

VIII

Appendices (Optional)

847

80 Appendices for Part I: Functions and Graphs

848

80.1 Sets . . . . . 848
80.2 Functions . . . . . 849
80.3 Reflection in a Line . . . . . 850
80.4 The Hyperbola $y = \frac{bx + c}{dx + e}$ . . . . . 852
80.5 The Hyperbola $y = \frac{ax^2 + bx + c}{dx + e}$ . . . . . 853

81 Appendices for Part II: Sequences and Series

855

82 Appendices for Part III: Vectors

856

82.1 Vectors in 2D . . . . . 856
82.2 Scalar Product . . . . . 857
82.3 The Ratio Theorem . . . . . 858
82.4 Vector Product . . . . . 859
82.5 2D Geometry . . . . . 862
82.6 3D Geometry . . . . . 863

83 Appendices for Part IV: Complex Numbers

869

84 Appendices for Part V: Calculus

872

84.1 Limits Formally Defined . . . . . 872
84.2 Left- and Right-Sided Limits . . . . . 874
84.3 Infinite Limits and Vertical Asymptotes . . . . . 875
84.4 Limits at Infinity, Horizontal, and Oblique Asymptotes . . . . . 876
84.5 Limit Laws . . . . . 877
84.6 Continuity . . . . . 880
84.7 Differentiation . . . . . 881
84.8 Differentiability Implies Continuity . . . . . 887
84.9 Maximum, Minimum, and Turning Points . . . . . 888
84.10 Concavity and Inflexion Points . . . . . 890
84.11 Concavity and Inflexion Points with Differentiability . . . . . 892
84.12 Inverse Function Theorem . . . . . 895
84.13 Parametric Differentiation . . . . . 896
84.14 Maclaurin Series . . . . . 897
84.15 Product of Two Power Series . . . . . 901
84.16 Composition of Two Functions . . . . . 903
84.17 The Fundamental Theorems of Calculus . . . . . 904
84.18 The Natural Logarithm and Euler’s Number e . . . . . 908
84.19 Euler’s Number . . . . . 910

85 Appendices for Part VI: Probability and Statistics

912

85.1 How to Count . . . . . 912
85.2 Circular Permutations . . . . . 914
85.3 Probability . . . . . 915
85.4 Random Variables . . . . . 916
85.5 The Poisson Distribution . . . . . 920
85.6 The Normal Distribution . . . . . 922
85.7 Sampling . . . . . 925
85.8 Null Hypothesis Significance Testing . . . . . 927
85.9 Calculating the Margin of Error . . . . . 928
85.10 Correlation and Linear Regression . . . . . 930
85.10.1 Deriving a Linear Model from the Barometric Formula . . . . . 932

IX

Answers to Exercises

933

86 Answers to Exercises in Part I: Functions and Graphs

934

86.1 Answers to Exercises in Ch. 1: Sets . . . . . 934
86.2 Answers to Exercises in Ch. 2: Dividing by Zero . . . . . 935
86.3 Answers to Exercises in Ch. 3: Functions . . . . . 936
86.4 Answers to Exercises in Ch. 4. Graphs . . . . . 942
86.5 Answers to Exercises in Ch. 5. Quick Revision . . . . . 945
86.6 Answers to Exercises in Ch. 6. Intercepts . . . . . 947
86.7 Answers to Exercises in Ch. 7. Symmetry . . . . . 948
86.8 Answers to Exercises in Ch. 8. Limits, Continuity, and Asymptotes . . . . . 949
86.9 Answers to Exercises in Ch. 9. Differentiation . . . . . 950
86.10 Answers to Exercises in Ch. 11. Stationary, Maximum, Minimum, and Inflexion Points . . . . . 952
86.11 Answers to Exercises in Ch. 14. Quadratic Equations . . . . . 960
86.12 Answers to Exercises in Ch. 15. Transformations . . . . . 961
86.13 Answers to Exercises in Ch. 16: Conic Sections . . . . . 963
86.14 Answers to Exercises in Ch. 17. Simple Parametric Equations . . . . . 975
86.15 Answers to Exercises in Ch. 18: Equations and Inequalities . . . . . 980

87 Answers to Exercises in Part II: Sequences and Series

992

87.1 Answers for Ch. 19: Finite Sequences . . . . . 992
87.2 Answers for Ch. 20: Infinite Sequences . . . . . 994
87.3 Answers for Ch. 22: Summation . . . . . 995
87.4 Answers for Ch. 23: Arithmetic Sequences and Series . . . . . 996
87.5 Answers for Ch. 24: Geometric Sequences and Series . . . . . 997
87.6 Answers for Ch. 25: Proof by Induction . . . . . 998

88 Answers to Exercises in Part III: Vectors

1001

88.1 Answers for Ch. 26: Quick Revision . . . . . 1001
88.2 Answers for Ch. 27: Vectors in 2D . . . . . 1002
88.3 Answers for Ch. 29: Vectors in 3D . . . . . 1007
88.4 Answers for Ch. 30: Vector Product . . . . . 1010
88.5 Answers for Ch. 31: Lines . . . . . 1011
88.6 Answers for Ch. 32: Planes . . . . . 1014
88.7 Answers for Ch. 33: Distances . . . . . 1016
88.8 Answers for Ch. 34: Angles . . . . . 1023
88.9 Answers for Ch. 35: Relationships between Lines and Planes . . . . . 1026

89 Answers to Exercises in Part IV: Complex Numbers

1030

89.1 Answers for Ch. 36: Introduction to Complex Numbers . . . . . 1030
89.2 Answers for Ch. 37: Basic Arithmetic of Complex Numbers . . . . . 1031
89.3 Answers for Ch. 38: Solving Polynomial Equations . . . . . 1033
89.4 Answers for Ch. 39: The Argand Diagram . . . . . 1036
89.5 Answers for Ch. 40: More Arithmetic of Complex Numbers . . . . . 1039
89.6 Answers for Ch. 41: Geometry of Complex Numbers . . . . . 1043
89.7 Answers for Ch. 42: Loci Involving Cartesian Equations . . . . . 1044
89.8 Answers for Ch. 43: Loci Involving Complex Equations . . . . . 1048
89.9 Answers for Ch. 44: De Moivre’s Theorem . . . . . 1051

90 Answers to Exercises in Part V: Calculus

1054

90.1 Answers for Ch. 45: Solving Problems Involving Differentiation . . . . . 1054
90.2 Answers for Ch. 46: Maclaurin Series . . . . . 1056
90.3 Answers for Ch. 47: The Indefinite Integral . . . . . 1059
90.4 Answers for Ch. 48: Integration Techniques . . . . . 1060
90.5 Answers for Ch. 49: The Fundamental Theorems of Calculus . . . . . 1069
90.6 Answers for Ch. 50: Definite Integrals . . . . . 1070
90.7 Answers for Ch. 51: Differential Equations . . . . . 1073

91 Answers to Exercises in Part VI: Probability and Statistics

1078

91.1 Answers for Ch. 52: How to Count: Four Principles . . . . . 1078
91.2 Answers for Ch. 53: How to Count: Permutations . . . . . 1081
91.3 Answers for Ch. 54: How to Count: Combinations . . . . . 1083
91.4 Answers for Ch. 55: Probability: Introduction . . . . . 1086
91.5 Answers for Ch. 56: Conditional Probability . . . . . 1090
91.6 Answers for Ch. 57: Probability: Independence . . . . . 1091
91.7 Answers for Ch. 59: Random Variables: Introduction . . . . . 1092
91.8 Answers for Ch. 60: Random Variables: Independence . . . . . 1096
91.9 Answers for Ch. 61: Random Variables: Expectation . . . . . 1096
91.10 Answers for Ch. 62: Random Variables: Variance . . . . . 1098
91.11 Answers for Ch. 63: The Coin-Flips Problem . . . . . 1098
91.12 Answers for Ch. 64: Bernoulli Trial and Distribution . . . . . 1098
91.13 Answers for Ch. 65: Binomial Distribution . . . . . 1099
91.14 Answers for Ch. 66: Poisson Distribution . . . . . 1100
91.15 Answers for Ch. 67: Continuous Uniform Distribution . . . . . 1102
91.16 Answers for Ch. 68: Normal Distribution . . . . . 1103
91.17 Answers for Ch. 71: Sampling . . . . . 1110
91.18 Answers for Ch. 72: Null Hypothesis Significance Testing . . . . . 1113
91.19 Answers for Ch. 73: Correlation and Linear Regression . . . . . 1118

92 Answers to Exercises in Part VII (2006-2015 A-Level Exams) 1122 92.1 Answers for Ch. 74: Functions and Graphs . . . . . . . . . . . . . . . . . . . . 1122 92.2 Answers for Ch. 75: Sequences and Series . . . . . . . . . . . . . . . . . . . . . 1150 92.3 Answers for Ch. 76: Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1179 92.4 Answers for Ch. 77: Complex Numbers . . . . . . . . . . . . . . . . . . . . . . 1193 92.5 Answers for Ch. 78: Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1216 92.6 Answers for Ch. 79: Probability and Statistics . . . . . . . . . . . . . . . . . . 1266

Page 36, Table of Contents

www.EconsPhDTutor.com

Part I: Functions and Graphs

1 Sets

The glory of [maths] is its complete irrelevance to our lives. That’s why it’s so fun!
- Paul Lockhart (2009, A Mathematician’s Lament, p. 38).

I have never done anything ‘useful’. No discovery of mine has made, or is likely to make, directly or indirectly, for good or ill, the least difference to the amenity of the world.
- G.H. Hardy (1940 [1967], A Mathematician’s Apology, p. 150).

The set is the basic building block of mathematics. Informally, a set is a “container” that usually has some objects in it, but can sometimes also be empty. Each object in a set is called an element (of that set).

Example 1. Let A = {3, π 2 , Clementi Mall, Love, the colour green}. Observations:

• The name of a set is often an upper-case letter; in this case, it is A.

• Mathematical punctuation marks called braces {} are used to denote a set. Listed within these braces are the elements of the set.

• Elements of the set are separated by commas (,). This mathematical punctuation mark means “and”.

• Thus, {3, π 2 , Clementi Mall, Love, the colour green} is the set consisting of five elements, namely 3 and π 2 and Clementi Mall and Love and the colour green.

• Elements in a set can be almost anything whatsoever! In this example, the elements included a building (Clementi Mall), an abstract notion (Love), and even a colour (green). The elements of a set can even be another set! But don’t worry, in the context of A-level maths, the elements of a set will almost always be numbers.

• When we talk about a set, we refer to both the “container” itself and all the objects in it.

Exercise 1. B is the set of the first 7 positive integers. Write down B in set notation. (Answer on p. 934.)

Exercise 2. C is the set of even prime numbers. Write down C in set notation. (Answer on p. 934.)

1.1 In ∈ and Not In ∉

The mathematical punctuation mark ∈ means “is in”, while ∉ means “is not in”.

Example 2. Let B = {1, 2, 3, 4, 5, 6, 7}. Then 1 ∈ B, 2 ∈ B, 3 ∈ B, etc. You can read these statements aloud as “1 is in B”, “2 is in B”, “3 is in B”, etc. We can also write 1, 2, 3 ∈ B (“1, 2, and 3 are in B”).

Also, 8 ∉ B, 9 ∉ B, 10 ∉ B, etc. (“8 is not in B”, “9 is not in B”, “10 is not in B”, etc.). We can also write 8, 9, 10 ∉ B (“8, 9, and 10 are not in B”).

Example 3. Cow ∈ {Cow, Chicken} reads aloud as “Cow is in the set consisting of Cow and Chicken”. Cow, Chicken ∈ {Cow, Chicken} reads aloud as “Cow and Chicken are in the set consisting of Cow and Chicken”.
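If you like to check things on a computer, here is a minimal sketch of Examples 2 and 3 in Python (my choice of language, not something the A-level syllabus asks for). Python's `in` and `not in` play the roles of ∈ and ∉. The variable names are made up for illustration.

```python
# B is the set from Example 2.
B = {1, 2, 3, 4, 5, 6, 7}

one_in_B = 1 in B              # "1 is in B"
eight_not_in_B = 8 not in B    # "8 is not in B"

# "1, 2, 3 ∈ B" is shorthand for three separate membership claims.
all_in_B = all(x in B for x in (1, 2, 3))

# Example 3: the elements need not be numbers.
farm = {"Cow", "Chicken"}
cow_in_farm = "Cow" in farm

print(one_in_B, eight_not_in_B, all_in_B, cow_in_farm)
```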

1.2 Greater than >, Less Than <, Positive > 0, and Negative < 0

In this textbook:

• Greater than means “strictly greater than” (>). So I won’t bother saying “strictly”, unless it’s something I want to emphasise.
• Less than means “strictly less than” (<).
• If I want to say greater than or equal to (≥) or smaller than or equal to (≤), I’ll say exactly that.
• Positive means “greater than zero” (> 0).
• Negative means “less than zero” (< 0).
• Non-negative means “greater than or equal to zero” (≥ 0).
• Non-positive means “less than or equal to zero” (≤ 0).
• 0 is neither positive nor negative. Instead, 0 is both non-negative and non-positive.

1.3 Types of Numbers

The following taxonomy lists the several types of numbers you’ll encounter in this textbook.

Complex Numbers
• Real Numbers
    • Rational Numbers
        • Integers
        • Non-Integers
    • Irrational Numbers
• Imaginary Numbers

We’ll study imaginary numbers only later on in Part IV of this textbook. For now, all numbers we’ll consider are real numbers (or reals). We won’t define what real numbers are. Instead, we’ll simply assume (like in secondary school) that “everyone knows” what real numbers are.

Infinity (∞) and negative infinity (−∞) are NOT numbers. Informally, ∞ is the “thing” that is greater than any real number. Similarly, −∞ is the “thing” that is smaller than any real number. I repeat: INFINITY IS NOT A NUMBER.¹⁷

So what exactly are real numbers, infinity, and negative infinity? This is actually a fascinating question that mathematicians were able to answer satisfactorily only from the 19th century, but it is beyond the scope of the A-levels.

Definition 1. An integer is any one of these real numbers: . . . , −3, −2, −1, 0, 1, 2, 3, . . .

Definition 2. A rational number (or simply a rational) is any real number that can be expressed as the ratio of two integers. An irrational number (or simply an irrational) is any other real number.

Example 4. The number 16 is an integer, a rational, and a real.

Example 5. The number 1.87 is a rational and a real, but it is not an integer.

Example 6. The number π ≈ 3.14159 is an irrational and a real, but it is neither an integer nor a rational.

¹⁷ Actually, the truth is somewhat more complicated. Under certain special contexts in more advanced mathematics, infinity is treated as a number. But in this textbook, I’ll keep it simple and insist that infinity is not a number.

1.4 The Order of the Elements Doesn’t Matter

The order in which we write out the elements of the set does not matter:

Definition 3. Two sets are equal (or identical) if both sets contain exactly the same elements.

Example 7. There are at least six equivalent ways to write the set of the 3 smallest positive even numbers: {2, 4, 6} = {2, 6, 4} = {4, 2, 6} = {4, 6, 2} = {6, 2, 4} = {6, 4, 2}.

Example 8. {Cow, Chicken} = {Chicken, Cow}.
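Here is a small Python sketch of Examples 7 and 8 (Python is my own choice here). Python's built-in `set` type also ignores order, so every ordering of 2, 4, 6 gives an equal set. The names `smallest_evens`, `orderings` and `all_equal` are made up for illustration.

```python
from itertools import permutations

smallest_evens = {2, 4, 6}

# All the different orders in which the three elements can be written.
orderings = list(permutations([2, 4, 6]))

# Each ordering, turned into a set, is equal to {2, 4, 6}.
all_equal = all(set(p) == smallest_evens for p in orderings)

# Example 8: the same holds for sets of non-numbers.
farm_equal = {"Cow", "Chicken"} == {"Chicken", "Cow"}

print(len(orderings), all_equal, farm_equal)
```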

1.5 Repeated Elements Don’t Count

Repeated elements are simply ignored.

Example 9. The set of the 3 smallest positive even numbers can be written as {2, 4, 6}. It can also be written as {2, 2, 4, 6} or {2, 6, 6, 6, 4, 4}.

The notation n({2, 4, 6}) denotes the number of elements in the set {2, 4, 6}. Hence, n({2, 4, 6}) = 3. And we also have n({2, 2, 4, 6}) = 3 and n({2, 6, 6, 6, 4, 4}) = 3.

Example 10. {Cow, Chicken} = {Cow, Cow, Chicken} = {Chicken, Cow, Chicken}. And n({Cow, Chicken}) = n({Cow, Cow, Chicken}) = n({Chicken, Cow, Chicken}) = 2.
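The same idea can be sketched in Python, whose `set` type also drops repeated elements. Here `len` plays the role of the syllabus notation n(·). This is only an illustration under that assumption, and the variable names are mine.

```python
a = {2, 4, 6}
b = {2, 2, 4, 6}          # the repeated 2 is ignored
c = {2, 6, 6, 6, 4, 4}    # so are the repeated 6s and 4s

same_set = (a == b == c)
sizes = [len(s) for s in (a, b, c)]   # len(...) corresponds to n(...)

# Example 10 with cows and chickens.
farm_sizes = [
    len({"Cow", "Chicken"}),
    len({"Cow", "Cow", "Chicken"}),
    len({"Chicken", "Cow", "Chicken"}),
]

print(same_set, sizes, farm_sizes)
```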

Note that more commonly, the number of elements in the set A is written as ∣A∣. But for some reason, the A-level syllabus instead uses the notation n(A), so that’s what we’ll use.

Exercise 3. W = {Apple, Apple, Apple, Banana, Banana, Apple}. What is n(W )? (Answer on p. 934.)

Exercise 4. C is the set of even prime numbers. What is n(C)? (Answer on p. 934.)

1.6 Ellipsis . . . Means Continue in the Obvious Fashion

The mathematical punctuation mark “. . . ” is called the ellipsis and means “continue in the obvious fashion”. Example 11. D is the set of all odd positive integers smaller than 100. So in set notation, we can write D = {1, 3, 5, 7, 9, 11, . . . , 99}.

Example 12. T is the set of all negative integers greater than −100. So in set notation, we can write T = {−99, −98, −97, . . . , −2, −1}.

What is obvious to one person might not be obvious to another. So only use the ellipsis when you’re confident it will be obvious to your reader! And never be shy to write a few more of the set’s elements (as I did with the sets above)! Exercise 5. Let D and T be as in the above two examples. What are n(D) and n(T )? (Answer on p. 934.)
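A computer cannot guess what “the obvious fashion” is, so in code the pattern behind the ellipsis has to be spelled out. Here is a hedged Python sketch of the sets D and T from Examples 11 and 12, using `range(start, stop, step)`, where `stop` itself is excluded.

```python
# D = {1, 3, 5, ..., 99}: start at 1, go up in steps of 2, stop before 100.
D = set(range(1, 100, 2))

# T = {-99, -98, ..., -1}: start at -99, go up in steps of 1, stop before 0.
T = set(range(-99, 0))

print(min(D), max(D), 7 in D, 8 in D)
print(min(T), max(T), -50 in T, 0 in T)
```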

1.7 Sets can be Finite or Infinite

Example 13. Z+ is the set of all positive integers. So, Z+ = {1, 2, 3, . . . }. And since Z+ is infinite, we write n(Z+ ) = ∞.

Example 14. Z is the set of all integers. So, Z = {. . . , −3, −2, −1, 0, 1, 2, 3, . . . }. And since Z is infinite, we write n(Z) = ∞. Obviously, for an infinite set, we cannot explicitly list out all of its elements. So we’ll often use ellipses to help out, as we did in the above examples. Alternatively, we can use interval notation or set-builder notation, which we’ll learn about shortly.

Exercise 6. H is the set of all prime numbers. Write down H in set notation. (Answer on p. 934.)

1.8 Special Names of Sets: Z, Q, R, and C

The following sets are so common that they have special symbols:

1. Z = {. . . , −3, −2, −1, 0, 1, 2, 3, . . . } is the set of all integers. (Z is for Zahl, German for number.)
2. Q is the set of all rational numbers. (Q is for quoziente, Italian for quotient.)
3. R is the set of all real numbers.
4. C is the set of all complex numbers. (To be studied only in Part IV of this textbook.)

To create a new set that contains only the positive (or negative) elements of the old set, append a superscript plus (+) or minus (−) to the name of a set:

1. Z+ = {1, 2, 3, . . . } is the set of all positive integers. Z− = {. . . , −3, −2, −1} is the set of all negative integers.
2. Q+ is the set of all positive rational numbers. Q− is the set of all negative rational numbers.
3. R+ is the set of all positive real numbers. R− is the set of all negative real numbers.

As we’ll learn later, there is no such thing as a positive or negative complex number. Hence, there is no such set named C+ or C−.

To add the number 0 to a set, append a subscript zero (0) to its name:

Example 15. The set A = {3, π 2 , Clementi Mall, Love, the colour green}. And so the set A0 = {3, π 2 , Clementi Mall, Love, the colour green, 0}. Example 16. The set B = {1, 2, 3, 4, 5, 6, 7}. And so the set B0 = {0, 1, 2, 3, 4, 5, 6, 7}.

Adding both a superscript + and a subscript 0 to the name of a set creates a new set that contains all positive elements of the old set and in addition the number 0. Similarly, adding both a superscript − and a subscript 0 to the name of a set creates a new set that contains all negative elements of the old set and in addition the number 0.

Example 17. If V = {−2, −1, 3, 4}, then V+ = {3, 4}, V− = {−2, −1}, V0+ = {0, 3, 4}, and V0− = {−2, −1, 0}.

Exercise 7. If U = {−1, 0, 2}, then what are U+, U−, U0, U0+, and U0−? (Answer on p. 934.)
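The superscript and subscript conventions can be sketched as three small Python helpers applied to the set V from Example 17. The helper names `positive_part`, `negative_part` and `with_zero` are hypothetical (they are not standard notation), and the first two assume that every element of the set is a number.

```python
def positive_part(s):
    """Return the positive elements of s (the superscript + convention)."""
    return {x for x in s if x > 0}

def negative_part(s):
    """Return the negative elements of s (the superscript - convention)."""
    return {x for x in s if x < 0}

def with_zero(s):
    """Return s together with the number 0 (the subscript 0 convention)."""
    return s | {0}

V = {-2, -1, 3, 4}

V_plus = positive_part(V)                 # {3, 4}
V_minus = negative_part(V)                # {-2, -1}
V0_plus = with_zero(positive_part(V))     # {0, 3, 4}
V0_minus = with_zero(negative_part(V))    # {-2, -1, 0}

print(V_plus, V_minus, V0_plus, V0_minus)
```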

1.9 Special Names of Sets: Intervals

Here are four new mathematical punctuation marks: The left-parenthesis: ( The right-parenthesis: )

The left-bracket: [ The right-bracket: ]

Together, () are called parentheses and [] are called brackets.

An interval is a (usually infinite) set of real numbers. It is written using parentheses and/or brackets. Let a and b be real numbers where b ≥ a. Then:

1. (a, b) is the set of all real numbers that are greater than a and smaller than b. Such a set is also called an open interval.

Example 18. The set I = (0, 3) denotes the set of all real numbers that are greater than 0 and smaller than 3. So for example, √2 ≈ 1.41 ∈ I, but 0, 3 ∉ I.

2. [a, b] is the set of all real numbers that are greater than or equal to a and smaller than or equal to b. Such a set is also called a closed interval.

Example 19. The set J = [0, 3] denotes the set of all real numbers that are greater than or equal to 0 and smaller than or equal to 3. So for example, the numbers 0, √2, 3 ∈ J.

3. (a, b] is the set of all real numbers that are greater than a and smaller than or equal to b. Such a set is also called a half-open interval or a half-closed interval.

Example 20. The set K = (0, 3] denotes the set of all real numbers that are greater than 0 and smaller than or equal to 3. So for example, the numbers √2, 3 ∈ K, but 0 ∉ K.

4. [a, b) is the set of all real numbers that are greater than or equal to a and smaller than b. Such a set is also called a half-open interval or a half-closed interval.

Example 21. The set L = [0, 3) denotes the set of all real numbers that are greater than or equal to 0 and smaller than 3. So for example, the numbers 0, √2 ∈ L, but 3 ∉ L.

Exercise 8. How many elements does the set Z = [1, 1] contain? (Answer on p. 934.)

Exercise 9. How many elements does the set Y = (1, 1) contain? (Answer on p. 934.)

Exercise 10. How many elements does the set X = (1, 1.01) contain? (Answer on p. 934.)

Exercise 11. Write down R, R+ , R+0 , R− , and R−0 in interval notation. (Answer on p. 934.)
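The four kinds of interval differ only in whether each endpoint is included. Here is a minimal Python sketch of that idea, checked against Examples 18 to 21. The class `Interval` and its field names are hypothetical, and Python's floating-point numbers only approximate the reals, so treat this as an illustration rather than a definition.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    a: float              # left endpoint
    b: float              # right endpoint
    closed_left: bool     # True for "[", False for "("
    closed_right: bool    # True for "]", False for ")"

    def __contains__(self, x):
        # An endpoint counts as "in" only if that side is closed.
        above = x >= self.a if self.closed_left else x > self.a
        below = x <= self.b if self.closed_right else x < self.b
        return above and below

I = Interval(0, 3, closed_left=False, closed_right=False)   # (0, 3)
J = Interval(0, 3, closed_left=True, closed_right=True)     # [0, 3]
K = Interval(0, 3, closed_left=False, closed_right=True)    # (0, 3]
L = Interval(0, 3, closed_left=True, closed_right=False)    # [0, 3)

root2 = math.sqrt(2)
print(root2 in I, 0 in I, 3 in I)
print(0 in J, 3 in J)
print(0 in K, 3 in K)
print(0 in L, 3 in L)
```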

1.10 Special Names of Sets: The Empty Set ∅

The empty set is literally the set that contains no elements. Hence the name! Definition 4. The empty set is the set {}. It can also be denoted ∅.

Example 22. In 2016, the set of all Singapore Ministers who are younger than 30 is {} or ∅. This means that there is no Singapore Minister who is younger than 30. Example 23. The set of all even prime numbers greater than 2 is {} or ∅. This means that there is no even prime number that is greater than 2.

Example 24. The set of numbers that are greater than 4 and smaller than 4 is {} or ∅. This means that there is no number that is simultaneously greater than 4 and smaller than 4.

As already mentioned, in this textbook (and also for the A-levels), the elements in a set will almost always be numbers. But in general, the elements of a set can be (nearly) anything whatsoever. In other words, a set really and simply is a “container” that can “contain” (nearly) anything whatsoever. Indeed, the elements of a set can be other sets, including even the empty set! Here are two examples to illustrate:

Example 25. The set {∅} is not the same as the set ∅. The former is a set containing a single element, namely the empty set. The latter is the empty set. It is perhaps clearer if we rewrite them as {∅} = {{}} and ∅ = {}. Now we clearly see that {{}} ≠ {}.

Note that the set {∅} = {{}} is certainly not empty, because it contains a single element (namely the empty set).

Example 26. The set {∅, 1, {∅}} is the set containing three elements, namely the empty set, the number 1, and a set containing the empty set.

1.11 Subset Of ⊆

Definition 5. A is a subset of B (written as A ⊆ B) if every element in A is also an element in B. Not surprisingly, A ⊈ B denotes that A is not a subset of B.

Example 27. Let M = {1, 2}, N = {1, 2, 3}, and O = {1, 2, 4, 5}. Then M ⊆ N , but N ⊈ M . Also, M ⊆ O, but O ⊈ M . Further, N ⊈ O and O ⊈ N .

Exercise 12. State whether Z, Q, and R are subsets of each other. (Answer on p. 934.)

Exercise 13. True or false: “The set of currently-serving Singapore Prime Ministers is a subset of the set of currently-serving Singapore Ministers.” (Answer on p. 934.)

The next fact is useful for showing that two sets are equal.

Fact 1. Two sets are subsets of each other ⟺ They are identical.

Proof. Optional, see p. 848 in the Appendices.

The symbol ⟺ stands for is equivalent to or if and only if. The above claim may be decomposed into two separate claims:

1. Two sets are subsets of each other ⟹ they are identical. (The symbol ⟹ stands for implies or only if.)
2. Two sets are subsets of each other ⟸ they are identical. (The symbol ⟸ stands for is implied by or if.)

Note importantly that A ⟹ B is different from its converse B ⟹ A. For example, x > 10 ⟹ x > 3, but it is certainly not the case that x > 3 ⟹ x > 10.
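Example 27 and Fact 1 can be checked with a short Python sketch. For Python sets, the operator `<=` means “is a subset of”. The set P below is an extra set I have made up to illustrate Fact 1.

```python
M = {1, 2}
N = {1, 2, 3}
O = {1, 2, 4, 5}

# Example 27: which sets are subsets of which?
checks = {
    "M ⊆ N": M <= N,
    "N ⊆ M": N <= M,
    "M ⊆ O": M <= O,
    "O ⊆ M": O <= M,
    "N ⊆ O": N <= O,
    "O ⊆ N": O <= N,
}

# Fact 1: two sets are subsets of each other exactly when they are equal.
P = {3, 2, 1}                       # hypothetical set, written in another order
mutual_subsets = (N <= P) and (P <= N)
identical = (N == P)

print(checks)
print(mutual_subsets, identical)
```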

1.12 Proper Subset Of ⊂

Definition 6. A is a proper subset of B (written as A ⊂ B) if A ⊆ B but A ≠ B.

Not surprisingly, A ⊄ B denotes that A is not a proper subset of B.

Example 28. Let M = {1, 2}, N = {1, 2, 3}, O = {1, 2, 4, 5}, and P = {1, 2, 3}. Then M ⊆ N, O, P and M ⊂ N, O, P . In contrast, N ⊆ P , but N ⊄ P ; this is because N = P .

Exercise 14. Is the set of all squares (call it S) a proper subset of the set of all rectangles (call it R)? (Answer on p. 934.)

Exercise 15. Does A ⊆ B imply that A ⊂ B? (Answer on p. 935.)

Exercise 16. Does A ⊂ B imply that A ⊆ B? (Answer on p. 935.)

Exercise 17. True or false statement: “If A is a subset of B, then A is either a proper subset of or is equal to B.” (Answer on p. 935.)

Remark 1. The official A-level syllabus uses the symbol ⊆ to mean “subset of” and ⊂ to mean “proper subset of”. So this is what we’ll use in this textbook. However, confusingly enough, many writers use the symbol ⊂ to mean “subset of” and ⊊ to mean “proper subset of”. We will not follow such practice in this textbook. Just to let you know, in case you get confused while reading other mathematical texts!
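Python happens to keep the two notions apart in the same way as this textbook: for Python sets, `<=` tests “subset of” (⊆) and `<` tests “proper subset of” (⊂). Here is a small sketch of Example 28. The helper `is_proper_subset` is hypothetical and simply restates Definition 6.

```python
def is_proper_subset(a, b):
    """Definition 6: a is a subset of b, but a is not equal to b."""
    return a <= b and a != b

M = {1, 2}
N = {1, 2, 3}
O = {1, 2, 4, 5}
P = {1, 2, 3}

m_proper = [M < other for other in (N, O, P)]   # M ⊂ N, O, P
n_subset_p = N <= P                             # N ⊆ P
n_proper_p = N < P                              # N ⊂ P fails because N = P

# The built-in operator agrees with Definition 6.
agrees = all(
    (a < b) == is_proper_subset(a, b)
    for a in (M, N, O, P)
    for b in (M, N, O, P)
)

print(m_proper, n_subset_p, n_proper_p, agrees)
```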

1.13 Union ∪

Definition 7. The union of the sets A and B (denoted A ∪ B) is the set of elements that are either in A OR B (or in both). Tip: “U” for Union.

Example 29. Let T = {1, 2}, U = {3, 4}, and V = {1, 2, 3}. Then T ∪ U = {1, 2, 3, 4}, T ∪ V = {1, 2, 3}, and U ∪ V = {1, 2, 3, 4}. And T ∪ U ∪ V = {1, 2, 3, 4}.

Exercise 18. Rewrite each of the following sets more simply: (a) [1, 2] ∪ [2, 3]. (b) (−∞, −3) ∪ [−16, 7). (c) {0} ∪ Z+. (Answer on p. 935.)

Exercise 19. What is the union of the set of squares (S) and the set of rectangles (R)? (Answer on p. 935.) Exercise 20. What is the union of the set of rationals (Q) and the set of irrationals? (Answer on p. 935.)
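Example 29 can be reproduced with Python sets, where the operator `|` computes the union. This is only a sketch for finite sets; it does not handle intervals such as those in Exercise 18.

```python
T = {1, 2}
U = {3, 4}
V = {1, 2, 3}

T_or_U = T | U          # {1, 2, 3, 4}
T_or_V = T | V          # {1, 2, 3}
U_or_V = U | V          # {1, 2, 3, 4}
all_three = T | U | V   # {1, 2, 3, 4}

print(T_or_U, T_or_V, U_or_V, all_three)
```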

1.14 Intersection ∩

Definition 8. The intersection of the sets A and B (denoted A ∩ B) is the set of elements that are in A AND B. Definition 9. Two sets intersect if their intersection contains at least one element (i.e. A ∩ B ≠ ∅). Definition 10. Two sets are mutually exclusive or disjoint if their intersection is empty (i.e. A ∩ B = ∅).

Example 30. Let T = {1, 2}, U = {3, 4}, and V = {1, 2, 3}. Then T ∩ U = ∅, T ∩ V = {1, 2}, and U ∩ V = {3}. And T ∩ U ∩ V = ∅. Exercise 21. Rewrite each of the following sets more simply: (a) (4, 7] ∩ (6, 9). (b) [1, 2] ∩ [5, 6]. (c) (−∞, −3) ∩ (−16, 7). (Answer on p. 935.)

Exercise 22. What is the intersection of the set of squares (S) and the set of rectangles (R)? (Answer on p. 935.) Exercise 23. What is the intersection of the set of rationals (Q) and the set of irrationals? (Answer on p. 935.)
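Example 30 and Definitions 9 and 10 can be sketched with Python sets, where `&` computes the intersection and the method `isdisjoint` tests whether the intersection is empty. Again, this covers finite sets only.

```python
T = {1, 2}
U = {3, 4}
V = {1, 2, 3}

T_and_U = T & U          # empty: T and U share no element
T_and_V = T & V          # {1, 2}
U_and_V = U & V          # {3}
all_three = T & U & V    # empty

# Definition 10: two sets are disjoint if their intersection is empty.
t_u_disjoint = T.isdisjoint(U)
t_v_disjoint = T.isdisjoint(V)

print(T_and_U, T_and_V, U_and_V, all_three)
print(t_u_disjoint, t_v_disjoint)
```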

1.15 Set Minus ∖

The set minus (sometimes also called set difference) operator is very convenient. Sadly, it is not in the A-level syllabus and so I’ll avoid using it in this textbook. Nonetheless, it’s worth a quick mention.

Definition 11. A set minus B (denoted A ∖ B or A − B) is the set that contains every element in A that is not also in B.

Example 31. Let T = {1, 2}, U = {3, 4}, and V = {1, 2, 3}. Then T ∖ U = T , T ∖ V = ∅, and U ∖ V = {4}.
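In Python, set minus is written with an ordinary minus sign between two sets. Here is a brief sketch of Example 31, for finite sets only.

```python
T = {1, 2}
U = {3, 4}
V = {1, 2, 3}

T_minus_U = T - U    # nothing in T is also in U, so we get T back
T_minus_V = T - V    # every element of T is also in V, so nothing is left
U_minus_V = U - V    # only 4 is in U but not in V

print(T_minus_U == T, T_minus_V, U_minus_V)
```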

1.16 Set Complement A′

Definition 12. The set complement of A (denoted A′ or Aᶜ) is the set of all elements that are not in A.

Example 32. Consider the set of positive integers. Let A = {2, 4, 6, 8, 10, . . . }. Then in this context, A′ = {1, 3, 5, 7, 9, 11, . . . }.

Example 33. Consider the set of all reals. Let A = R+ . Then in this context, A′ = R−0 .

Example 34. We roll a die, hoping for an outcome of either 1 or 6. Our set of desired outcomes may thus be written as A = {1, 6}.

Unfortunately, we do not get any of our desired outcomes. We may thus say that the actual outcome was an element of the set A′ = {2, 3, 4, 5}.
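A complement only makes sense relative to some context (here, the six possible outcomes of the die). In the Python sketch below, that context is made explicit as a set I have called `outcomes`, and the complement of A is everything in `outcomes` that is not in A.

```python
# The context: all possible outcomes of rolling a die.
outcomes = {1, 2, 3, 4, 5, 6}

# Example 34: our desired outcomes.
A = {1, 6}

# The complement of A, relative to the set of all outcomes.
A_complement = outcomes - A

print(A_complement)
```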

1.17 Set-Builder Notation

Set-builder notation is an alternative method of writing down sets. In the current context, the mathematical punctuation mark colon “∶” will mean “such that”.

Example 35. The set {x ∈ R ∶ x > 0} contains all x ∈ R such that x > 0. In words, this set contains all real numbers that are positive.

What comes after the colon are the conditions or criteria that x must satisfy, in order to qualify as a member of the set.

Our sets will usually contain only numbers, but here’s an example to show you how we can write down one particular set of musical artists.

Example 36. The set {x ∶ x is an artist that has had a US Billboard Hot 100 #1 Single} contains all the artists who have ever had a US Billboard Hot 100 #1 Single.

It will however be more typical for our sets to be sets such as these:

Example 37. {x ∈ R ∶ x > 0} = R+ , Q+ = {x ∈ Q ∶ x > 0}, Z+ = {x ∈ Z ∶ x > 0}, R+0 = {x ∈ R ∶ x ≥ 0}, Q+0 = {x ∈ Q ∶ x ≥ 0}, and Z+0 = {x ∈ Z ∶ x ≥ 0}.

Remark 2. We use the colon ∶ but some writers use instead the pipe ∣.

Exercise 24. Write down R− , Q− , Z− , R−0 , Q−0 , and Z−0 in set-builder notation. (Answer on p. 935.) Exercise 25. Write down (a, b), [a, b], (a, b], and [a, b) in set-builder notation. (Answer on p. 935.) Exercise 26. Let X = {x ∶ x is a living current or former Prime Minister of Singapore}. Write down the set X so that all its elements are explicitly stated. (Answer on p. 935.) Exercise 27. Rewrite each of the following sets in set-builder notation: (a) (−∞, −3) ∪ √ (5, ∞). (b) (−∞, 2] ∪ (e, π) ∪ (π, ∞). (c) (−∞, 3) ∩ (0, 7). (Answer on p. 935.)
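Set-builder notation has a close cousin in Python called the set comprehension: the part before `for` says what goes into the set, and the part after `if` plays the role of the conditions after the colon. A computer cannot list all of Z, so the sketch below uses a small finite stand-in that I have called `universe` (an assumption made purely for illustration).

```python
# A finite stand-in for Z: the integers from -3 to 3.
universe = range(-3, 4)

# {x ∈ universe : x > 0}
positives = {x for x in universe if x > 0}

# {x ∈ universe : x ≥ 0}
non_negatives = {x for x in universe if x >= 0}

# Several conditions can be combined: {x ∈ universe : x > -2 and x is even}
small_evens = {x for x in universe if x > -2 and x % 2 == 0}

print(positives, non_negatives, small_evens)
```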

2 Dividing By Zero

This very brief chapter is to warn you against making a common mistake — dividing by 0. Students have little trouble avoiding this mistake if the divisor is obviously a big fat 0. Instead, students usually make this mistake when the divisor is an unknown constant or variable that might be 0. Example 38. Find the values of x for which x(x − 1) = (2x − 2)(x − 1).

Here’s the wrong solution: “Divide both sides by x − 1 to get x = 2x − 2. So x = 2.”

Here’s the correct solution:

“Case #1. Suppose x − 1 = 0. Then the given equation is satisfied. So x = 1 is one possible value for which x(x − 1) = (2x − 2)(x − 1).

Case #2. Now suppose x − 1 ≠ 0. So we can divide both sides by x − 1 to get x = 2x − 2. So x = 2.

Conclusion. The two possible values of x for which x(x − 1) = (2x − 2)(x − 1) are x = 1 and x = 2.”

Moral of the story. Whenever you divide by a certain quantity, make sure it’s non-zero. If you’re not sure whether it equals 0, then break up your analysis into two cases, as was done in the above example: Case #1 — the quantity equals 0 (and see what happens in this case); Case #2 — the quantity is non-zero (in which case you can go ahead and divide).
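As a sanity check on Example 38, here is a short Python sketch that simply tries a range of integers in the equation x(x − 1) = (2x − 2)(x − 1). This brute-force search is my own illustration and is not a substitute for the two-case argument (it only tests the integers from −10 to 10), but it does show that the “wrong solution” loses x = 1.

```python
def lhs(x):
    return x * (x - 1)

def rhs(x):
    return (2 * x - 2) * (x - 1)

# Try every integer from -10 to 10 and keep those that satisfy the equation.
solutions = [x for x in range(-10, 11) if lhs(x) == rhs(x)]

# The "wrong solution" divides by x - 1 and so finds only x = 2.
wrong_solution = [2]
missed = [x for x in solutions if x not in wrong_solution]

print(solutions, missed)
```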

By the way, let’s take this opportunity to clear up another popular misconception: you may have heard that 1/0 = ∞. This is wrong. 1/0 ≠ ∞. Instead, any non-zero number divided by 0 is undefined.¹⁸ “Undefined” is the mathematician’s way of saying, “You haven’t told me what you are talking about. So what you are saying is meaningless.”

Exercise 28. What’s wrong with this “proof” that 1 = 0? (Answer on p. 935.)

1. Let x, y be positive numbers such that x = y.
2. Square both sides: x² = y².
3. Rearrange: x² − y² = 0.
4. Factorise: (x − y)(x + y) = 0.
5. Divide both sides by x − y to get x + y = 0.
6. Since x = y, sub y = x into the above equation to get 2x = 0.
7. Divide both sides by 2x to get 1 = 0.

¹⁸ A special case is 0/0, which is indeterminate. This means that 0/0 is sometimes undefined, but can sometimes be defined under certain circumstances.

3 Functions

Undoubtedly the most important concept in all of mathematics is that of a function - in almost every branch of modern mathematics functions turn out to be the central objects of investigation.
- Michael Spivak (1994 [2006], Calculus, p. 39).

You are probably familiar from secondary school with such statements as: “Let f (x) = x + 8 be a function.” Strictly speaking, this is not the correct way of describing a function. Here is a more precise definition of a function.¹⁹

A function consists of three pieces:

1. A set called the domain;
2. A set called the codomain; and
3. A mapping rule (or simply mapping or simply rule) which specifies how each and every element in the domain is mapped (or assigned) to one (and exactly one) element in the codomain.

Remark 3. The codomain is not the same thing as the range. We’ll learn about the range only in the next section.

Altogether then, a function simply maps (or assigns) each element in the domain to one (and exactly one) element in the codomain.

Example 39. Let f be the function whose ...

• Domain is the set {Cow, Chicken};
• Codomain is the set {Produces eggs, Produces milk, Guards the home}; and
• Mapping rule is, informally, “match the animal to its role”.

According to the mapping rule, “Cow” (in the domain) is mapped to “Produces milk” (in the codomain) and “Chicken” (in the domain) is mapped to “Produces eggs” (in the codomain). Every element in the domain is mapped to exactly one element in the codomain. This is indeed a function, because it has a domain, codomain, and a correctly-specified mapping rule.

¹⁹ This definition is still informal. See Definition 137 in the Appendices for the exact, formal definition (optional).


Example 40. Let f be the function whose ... • Domain is the set {1, 2}; • Codomain is the set {1, 2, 3, 4, 5}; and

• Mapping rule is, informally, “multiply by 2”.

According to the mapping rule, “1” (in the domain) is mapped to “2” (in the codomain) and “2” (in the domain) is mapped to “4” (in the codomain). Every element in the domain is mapped to exactly one element in the codomain. This is indeed a function, because it has a domain, codomain, and a correctly-specified mapping rule.

Example 41. Let f be the function whose ... • Domain is the set R; • Codomain is the set R; and

• Mapping rule is, informally, “round off to the nearest integer, where half-integers are rounded up”.

According to the mapping rule, “3” (in the domain) is mapped to “3” (in the codomain), “3.14159” (in the domain) is mapped to “3” (in the codomain), “3.5” (in the domain) is mapped to “4” (in the codomain), and “3.88” (in the domain) is mapped to “4” (in the codomain). Every element in the domain is mapped to exactly one element in the codomain.

This is indeed a function, because it has a domain, codomain, and a correctly-specified mapping rule.
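The three pieces of a function can be sketched in Python as follows. The helper `make_function` is hypothetical: it takes a domain, a codomain and a rule, checks that the rule sends every element of the domain to an element of the codomain, and records the result as a dictionary. Example 41 has an infinite domain, so there I only write down the rule. I use `math.floor(x + 0.5)` for “half-integers are rounded up”, because Python's built-in `round` treats half-integers differently.

```python
import math

def make_function(domain, codomain, rule):
    """Map every x in the domain to rule(x), insisting that rule(x) is in the codomain."""
    mapping = {}
    for x in domain:
        y = rule(x)
        if y not in codomain:
            raise ValueError(f"{x!r} is mapped to {y!r}, which is not in the codomain")
        mapping[x] = y
    return mapping

# Example 39: match the animal to its role.
roles = {"Cow": "Produces milk", "Chicken": "Produces eggs"}
f39 = make_function(
    {"Cow", "Chicken"},
    {"Produces eggs", "Produces milk", "Guards the home"},
    roles.get,
)

# Example 40: multiply by 2.
f40 = make_function({1, 2}, {1, 2, 3, 4, 5}, lambda x: 2 * x)

# Example 41: round off to the nearest integer, half-integers rounded up.
def round_half_up(x):
    return math.floor(x + 0.5)

rounded = [round_half_up(x) for x in (3, 3.14159, 3.5, 3.88)]

print(f39, f40, rounded)
```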

3.1 Formal Mathematical Notation for Functions

In general, the correct way to describe a function is: The function f ∶ D → C is defined by f ∶ x ↦ f (x) for all x ∈ D.

Or alternatively:

The function f ∶ D → C is defined by f (x) = ... for all x ∈ D.

This says that the function’s name is f , its domain is D, and its codomain is C. The last bit “f ∶ x ↦ f (x)” is the mapping rule and this mapping rule applies to “all x ∈ D” (all elements in the domain).

To save ourselves a bit of writing, if it’s clear from the context that we’re talking about the function f , then we’ll omit “f ∶” from the front of the mapping rule. Also, if the mapping rule applies universally to all elements of the domain, then we also omit the “for all x ∈ D” at the end. Altogether then, we will often simply write:

The function f ∶ D → C is defined by x ↦ f (x).

We will sometimes also denote the domain and codomain of f by Dom(f ) and Cod(f ).


Example 40 (revisited). In formal mathematical notation, we write: “the function f ∶ {1, 2} → {1, 2, 3, 4, 5} is defined by x ↦ 2x.” Or alternatively, “the function f ∶ {1, 2} → {1, 2, 3, 4, 5} is defined by f (x) = 2x.” This says that the function has:

• Name f ;
• Domain {1, 2};
• Codomain {1, 2, 3, 4, 5}; and
• Mapping rule: Map every element x in the domain to the element 2x in the codomain.

Let’s examine this formal notation a little more closely. In the context of set-builder notation (Section 1.17), the mathematical punctuation mark colon “∶” stood for “such that”. However, in the context of functions, the colon “∶” stands instead for “from”. Unfortunately there are only so many symbols and punctuation marks, so inevitably some symbols will have to play more than one role!

The mathematical punctuation mark “→” (right arrow) simply stands for “to”. Altogether then, “f ∶ D → C” reads as “f is the function from domain D to codomain C”.

The mathematical punctuation mark “↦” stands for “maps to”. Hence, “x ↦ f (x)” reads as “x is mapped to f (x)”.


Example 39 (revisited). In formal mathematical notation, we write: “the function f ∶ {Cow, Chicken} → {Produces eggs, Produces milk, Guards the home} is defined by Cow ↦ Produces milk and Chicken ↦ Produces eggs.” Or alternatively, “the function f ∶ {Cow, Chicken} → {Produces eggs, Produces milk, Guards the home} is defined by f (Cow) = Produces milk and f (Chicken) = Produces eggs.” This says that the function has:

• Name f ;
• Domain {Cow, Chicken};
• Codomain {Produces eggs, Produces milk, Guards the home}; and
• Mapping rule: Map the element “Cow” in the domain to the element “Produces milk” in the codomain and the element “Chicken” in the domain to the element “Produces eggs” in the codomain.

Example 41 (revisited). In formal mathematical notation, we write: “the function f ∶ R → R is defined by x ↦ Integer closest to x (with half-integers rounded up)”. This says that the function has:

• Name f ;
• Domain R;
• Codomain R; and
• Mapping rule: Map every element x in the domain to the closest integer in the codomain (with half-integers rounded up).

Students frequently believe that f (x) denotes a function. This is wrong. f and f (x) refer to two different things. f denotes a function. f (x) denotes the value of f at x.

This may seem like an excessively pedantic distinction. But maths is precise and pedantic. In maths, what we mean is precisely what we say and what we say is precisely what we mean. There is never any room for ambiguity or alternative interpretations. More examples:


Example 42. “The function f ∶ [0, 1] → R is defined by x ↦ 3x + 4.” Or alternatively, “The function f ∶ [0, 1] → R is defined by f (x) = 3x + 4.” This says that the function’s name is f , its domain is [0, 1] (the set of all reals between 0 and 1, including 0 and 1), its codomain is R (the set of all reals), and its mapping rule is that we map each element x in the domain to the element 3x + 4 in the codomain. The value of f at 0.5 is f (0.5) = 3(0.5) + 4 = 5.5. What is f (3)? It is not 3(3) + 4 = 13. This is because 3 is not in the domain of f . Hence, f (3) is simply undefined.
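The role of the domain in Example 42 can be sketched in Python. This is only an illustration: the helper name `f` and the choice to raise `ValueError` for inputs outside [0, 1] are assumptions made for the sketch, not part of the formal notation.

```python
def f(x):
    """Sketch of f : [0, 1] -> R defined by x |-> 3x + 4."""
    # A function is only defined on its domain, so reject anything outside [0, 1].
    if not (0 <= x <= 1):
        raise ValueError(f"{x} is not in the domain [0, 1], so f({x}) is undefined")
    return 3 * x + 4


print(f(0.5))  # 5.5

try:
    f(3)
except ValueError as err:
    print(err)  # explains that 3 is outside the domain
```

Note that the sketch refuses to return 3(3) + 4 = 13 for the input 3, mirroring the point that f (3) is undefined.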

Example 43. “The function f ∶ R+ → R is defined by x ↦ ln x.” Or alternatively, “The function f ∶ R+ → R is defined by f (x) = ln x.” This says that the function’s name is f , its domain is R+ (the set of all positive reals), its codomain is R (the set of all reals), and its mapping rule is that we map each element x in the domain to the element ln x in the codomain. The value of f at 2 is f (2) = ln 2 ≈ 0.693.

f (0) is simply undefined, because 0 is not in the domain of f . Likewise, f (a) is undefined, for any a < 0.

Exercise 29. For each of the following functions, write down the value of the function at 1. (a) The function f ∶ R → R is defined by x ↦ x + 1. (b) The function g ∶ [−1, 1] → R is defined by x ↦ 17x. (c) The function h ∶ Z+ → R is defined by x ↦ 3^x. (d) The function i ∶ Z− → R is defined by x ↦ 3^x. (Answer on p. 936.)


3.2 EVERY x ∈ D Must be Mapped to EXACTLY ONE y ∈ C

This section simply repeats and emphasises what was already said above.

Example 44. Say we have ...
• Domain {Cow, Chicken};
• Codomain {Produces eggs, Produces milk, Guards the home}; and
• Mapping rule: “Chicken” is mapped to “Produces eggs”.

Can we define a function using the above domain, codomain, and mapping rule? No. The reason is that the mapping rule fails to specify what “Cow” (an element of the domain) should be mapped to. It thus fails the requirement that every element in the domain be mapped to an element in the codomain.

Example 45. Say we have ...
• Domain {Cow, Chicken};
• Codomain {Produces eggs, Produces milk, Guards the home}; and
• Mapping rule: “Cow” is mapped to both “Produces milk” and “Guards the home”; and “Chicken” is mapped to “Produces eggs”.

Can we define a function using the above domain, codomain, and mapping rule? No. The reason is that the mapping rule maps “Cow” (an element of the domain) to more than one element in the codomain. It thus fails the requirement that every element in the domain be mapped to exactly one element in the codomain.

Example 46. Say we have ...
• Domain R;
• Codomain [0, 1]; and
• Mapping rule: x ↦ x + 1.

Can we define a function using the above domain, codomain, and mapping rule? No. The reason is that the mapping rule fails to map some elements in the domain to any element in the codomain (e.g. 14 would have to be mapped to 15, which is not in [0, 1]). It thus fails the requirement that every element in the domain be mapped to an element in the codomain.
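For finite sets, the "exactly one" requirement can be checked mechanically. The sketch below is only an illustration: the helper `defines_function` and the representation of a mapping rule as a dictionary of lists are assumptions made for this example.

```python
def defines_function(domain, codomain, rule):
    """rule sends each domain element to the list of elements it is mapped to."""
    for x in domain:
        images = rule.get(x, [])
        # Each x needs exactly one image, and that image must lie in the codomain.
        if len(images) != 1 or images[0] not in codomain:
            return False
    return True


domain = {"Cow", "Chicken"}
codomain = {"Produces eggs", "Produces milk", "Guards the home"}

rule_39 = {"Cow": ["Produces milk"], "Chicken": ["Produces eggs"]}
rule_44 = {"Chicken": ["Produces eggs"]}  # "Cow" is not mapped to anything
rule_45 = {"Cow": ["Produces milk", "Guards the home"],  # "Cow" has two images
           "Chicken": ["Produces eggs"]}

print(defines_function(domain, codomain, rule_39))  # True
print(defines_function(domain, codomain, rule_44))  # False
print(defines_function(domain, codomain, rule_45))  # False
```

An infinite domain such as R cannot be checked exhaustively this way; there, a single counter-example (such as 14 in Example 46) is what settles the matter.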


Example 47. Say we have ...
• Domain R;
• Codomain R; and
• Mapping rule: x ↦ ±x.

Can we define a function using the above domain, codomain, and mapping rule? No. The reason is that the mapping rule maps every non-zero element in the domain (e.g. 14) to more than one element in the codomain (+14 and −14). It thus fails the requirement that every element in the domain be mapped to exactly one element in the codomain.

For Exercises 30-37: (i) State (yes/no) whether we can define a function using the given domain, codomain, and rule. (ii) Explain why or why not. (iii) If we can, then write down the function in formal notation.

Exercise 30. Let the domain be {5, 6, 7}, the codomain be Z+ , and the mapping rule be x ↦ 2x. (Answer on p. 936.)

Exercise 31. Let the domain be {0, 3}, the codomain be {3, 4}, and the mapping rule be (informally) “any larger number will work”. (Answer on p. 936.)

Exercise 32. Let the domain be {2, 4}, the codomain be {3, 4}, and the mapping rule be (informally) “any smaller number will work”. (Answer on p. 936.) Exercise 33. Let the domain be {1}, the codomain be {1}, and the mapping rule be (informally) “stay exactly the same”. (Answer on p. 936.) Exercise 34. Let the domain be {1}, the codomain be {1, 2}, and the mapping rule be (informally) “stay exactly the same”. (Answer on p. 936.)

Exercise 35. Let the domain be {1, 2}, the codomain be {1}, and the mapping rule be (informally) “stay exactly the same”. (Answer on p. 936.)

Exercise 36. Let the domain be R, the codomain be R, and the mapping rule be x ↦ √x. (Answer on p. 936.)

Exercise 37. Let the domain be R, the codomain be R, and the mapping rule be x ↦ 1/x. (Answer on p. 936.)

Exercise 38. How might you change the domain in Exercise 36 so that a function can be defined? (Answer on p. 937.) Exercise 39. How might you change the domain in Exercise 37 so that a function can be defined? (Answer on p. 937.)


3.3 Real-Valued Functions of a Real Variable

Definition 13. A function of a real variable is any function whose domain is a subset of R. Definition 14. A real-valued function is any function whose codomain is a subset of R.

Altogether then, a real-valued function of a real variable is any function whose domain and codomain are both subsets of R.

Example 48. Consider the functions f ∶ R → R, g ∶ R → R, and h ∶ R → R defined by x ↦ x². All three are real-valued functions, functions of a real variable, and thus also real-valued functions of a real variable.

Consider the function i ∶ {Cow, Chicken} → Z defined by Cow ↦ 5 and Chicken ↦ 32. This is a real-valued function, but not a function of a real variable. Thus, it is not a real-valued function of a real variable.

Consider the function j ∶ Z → {Cow, Chicken} defined by x ↦ Cow if x is odd and x ↦ Chicken if x is even. This is a function of a real variable, but not a real-valued function.

Almost all functions considered in H2 Maths are real-valued functions of a real variable. So we’ll see plenty of functions like f , g, and h from the above example, but rarely (if ever) will we see functions like i or j. In this textbook, unless clearly stated otherwise, it may be assumed that all functions are real-valued functions of a real variable.


3.4 The Range of a Function

Informally, the range of a function f ∶ D → C — denoted f (D) — is the set of elements in the codomain C that are “hit” by the function. Formally: Definition 15. The range of a function f ∶ D → C is f (D) = {y ∈ C ∶ There is x ∈ D such that f (x) = y}.

The range of f may be denoted Range(f ) or f (D) (if D is the domain of f ).

The range is not the same thing as the codomain. Because this is such a common misconception, let me repeat:

♡ The range is not the same thing as the codomain. ♡

Indeed, the range is usually a proper subset of the codomain, as is the case in each of the following examples.

Example 49. Define f ∶ [0, 1] → R by x ↦ x + 1. Then Range(f ) = f ([0, 1]) = [1, 2].

Example 50. Define f ∶ {2, 3} → R by x ↦ x + 1. Then Range(f ) = f ({2, 3}) = {3, 4}.

Example 51. Define f ∶ R → R by x ↦ e^x. Then Range(f ) = f (R) = R+ .
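When the domain is finite, the range can be computed by simply collecting every value that is "hit". The following sketch reproduces Example 50; the helper name `range_of` and the finite stand-in codomain are assumptions made for illustration (the codomain in Example 50 is all of R, which cannot be listed).

```python
def range_of(f, domain):
    """Range of f over a finite domain: the set of values actually hit."""
    return {f(x) for x in domain}


def f(x):
    return x + 1


domain = {2, 3}
image = range_of(f, domain)
print(sorted(image))  # [3, 4]

# A finite stand-in for the codomain, to show the range as a proper subset of it.
codomain_sample = {1, 2, 3, 4, 5}
print(image < codomain_sample)  # True: the range is a proper subset here
```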

The range is often a proper subset of the codomain, but sometimes they can be equal:

Example 52. Define f ∶ R+ → R by x ↦ ln x. Then Range(f ) = f (R+ ) = R = Cod(f ).

Exercise 40. Let the function f ∶ R₀⁺ → R be defined by x ↦ √x. What is the range of f ? (Answer on p. 937.)

Exercise 41. Let the function f ∶ Z → Z be defined by x ↦ x². What is the range of f ? (Answer on p. 937.)

Exercise 42. Which of the following statements is/are true? (a) “The range of any function is a subset of its domain.” (b) “The range of any function is a subset of its codomain.” (c) “The range of any function is a proper subset of its codomain.” (Answer on p. 937.)


3.5 Creating New Functions

Let f ∶ A → R and g ∶ B → R be functions with mapping rules f ∶ x ↦ f (x) and g ∶ x ↦ g(x). Let k ∈ R be a constant. Then we can create the function f + g in the “obvious” fashion. We can also create the functions f − g, f ⋅ g, f /g, and kf in the “obvious” fashions.²⁰

The symbol ⋅ is an alternative symbol for multiplication. We will often prefer using ⋅ rather than × because there is the slight risk of confusing × with the letter x.

As we shall see, f g shall refer to a function that is entirely different from f ⋅ g, so we must really be careful to write f ⋅ g when that is what we mean.

Example 53. Let f ∶ R → R be defined by x ↦ 7x + 5 and g ∶ R → R be defined by x ↦ x³. Let k = 2. Then

• f + g is the function with domain R, codomain R, and mapping rule x ↦ 7x + 5 + x³;
• f − g is the function with domain R, codomain R, and mapping rule x ↦ 7x + 5 − x³;
• f ⋅ g is the function with domain R, codomain R, and mapping rule x ↦ (7x + 5) x³;
• f /g is the function with domain R− ∪ R+ , codomain R, and mapping rule x ↦ (7x + 5) /x³; and
• kf is the function with domain R, codomain R, and mapping rule x ↦ 2 (7x + 5).

We can of course give these five new functions new names (perhaps a single-letter name for each), but this is not necessary. We can simply write:

(f + g) (1) = 7(1) + 5 + 1³ = 13,
(f − g) (1) = 7(1) + 5 − 1³ = 11,
(f ⋅ g) (1) = [7 (1) + 5] (1)³ = 12,
(f /g) (1) = [7 (1) + 5] /1³ = 12,
(kf )(1) = 2 [7 (1) + 5] = 24,

where the pairs of parentheses around each of the five new functions are just to be clear that we are talking about a single, fully-fledged function.

²⁰ Formally, f + g is the function with domain A ∩ B, codomain R, and mapping rule x ↦ f (x) + g(x). Similarly, f − g is the function with domain A ∩ B, codomain R, and mapping rule x ↦ f (x) − g(x). f ⋅ g is the function with domain A ∩ B, codomain R, and mapping rule x ↦ f (x)g(x). f /g is the function with domain {x ∶ x ∈ A ∩ B, g(x) ≠ 0}, codomain R, and mapping rule x ↦ f (x)/g(x). This domain, which may also be written (A ∩ B) ∖ {x ∶ g(x) = 0}, is the set of all elements x that are in both A and B, excluding those for which g(x) = 0. This exclusion is necessary, otherwise f (x)/g(x) may sometimes not be well-defined. Finally, kf is simply the function with domain A, codomain R, and mapping rule x ↦ kf (x).
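The five new functions of Example 53 can be mimicked in Python as a quick check of the arithmetic. This is a minimal sketch; the function names (`f_plus_g` and so on) are invented for illustration, and the `ValueError` is just one assumed way of signalling that a point lies outside the domain of f /g.

```python
def f(x):
    return 7 * x + 5


def g(x):
    return x ** 3


k = 2


def f_plus_g(x):
    return f(x) + g(x)


def f_minus_g(x):
    return f(x) - g(x)


def f_times_g(x):
    return f(x) * g(x)


def f_over_g(x):
    # f/g is only defined where g(x) is non-zero.
    if g(x) == 0:
        raise ValueError("g(x) = 0, so x is not in the domain of f/g")
    return f(x) / g(x)


def k_times_f(x):
    return k * f(x)


print(f_plus_g(1), f_minus_g(1), f_times_g(1), f_over_g(1), k_times_f(1))
# 13 11 12 12.0 24
```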


3.6 One-to-One Functions

Informally, a function is one-to-one (or invertible) if every element in its range is “hit” exactly once (by exactly one element in the domain). Put another way: every element y in the range corresponds to exactly one element in the domain. Formally: Definition 16. A function f ∶ D → C is one-to-one (or invertible) if for every y ∈ f (D), there is only one x ∈ D such that f (x) = y.

Example 54. Consider the function f whose domain is the set {Cow, Chicken}, codomain is the set {Produces eggs, Produces milk, Guards the home}, and mapping rule is Cow ↦ Produces milk and Chicken ↦ Produces eggs. The range is {Produces eggs, Produces milk}.

This function is one-to-one because each element in the range is “hit” exactly once, as we can easily verify: Produces eggs is “hit” once by Chicken and Produces milk is “hit” once by Cow. Example 55. Let f ∶ [0, 1] → R be defined by x ↦ x + 1. The range of f is [1, 2].

To check whether this function is one-to-one, we need to show that every element y in the range corresponds to exactly one element x in the domain. To this end, let’s pick any element y in the range and write: y = x + 1 ⇐⇒ y − 1 = x.

Thus, indeed, this function is one-to-one — every element y in the range corresponds to exactly one element y − 1 in the domain. To show that a function is not one-to-one, simply give a counter-example:

Example 56. Let f ∶ R → R be defined by x ↦ x². The range of f is R₀⁺ (the set of all non-negative reals, since f (0) = 0).

This function is not one-to-one — for example, the element 9 in the range is “hit” twice, once by −3 and again by 3. Remark 4. One-to-one or invertible functions are also known as injective functions (or simply injections), but we won’t use this term in this textbook.
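Hunting for a counter-example can be sketched in Python by testing a finite sample of the domain. The helper `is_one_to_one_on` is hypothetical, and a result of `True` only says that no clash was found within the sample; it proves nothing about the whole domain.

```python
def is_one_to_one_on(f, sample):
    """Return False if two distinct sample points share an image, else True."""
    seen = {}  # maps each image y to the x that produced it
    for x in sample:
        y = f(x)
        if y in seen and seen[y] != x:
            return False  # y is hit twice, by seen[y] and by x
        seen[y] = x
    return True


sample = range(-3, 4)  # the integers -3, -2, ..., 3

print(is_one_to_one_on(lambda x: x ** 2, sample))  # False: -3 and 3 both hit 9
print(is_one_to_one_on(lambda x: x + 1, sample))   # True (on this sample only)
```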

Exercise 43. State and explain whether each of the following functions is one-to-one. (a) f ∶ R₀⁺ → R is defined by x ↦ √x. (b) g ∶ R₀⁺ → R is defined by x ↦ x². (c) h ∶ R → R is defined by x ↦ ∣x∣. (d) i ∶ R₀⁺ → R is defined by x ↦ ∣x∣. (e) j ∶ R → R is defined by x ↦ sin x. (Answer on p. 938.)


3.7 Inverse Functions

Definition 17. If f ∶ D → C is invertible, then its inverse function f⁻¹ ∶ f (D) → D is defined by the mapping rule “f⁻¹(y) = x ⇐⇒ y = f (x)”. Only invertible functions have inverse functions. If a function is not invertible, then its inverse function simply does not exist.

Given a one-to-one (or invertible) function f , to find its inverse function f⁻¹, follow these steps:
1. Dom (f⁻¹) = Range (f ).
2. Cod (f⁻¹) = Dom (f ).
3. Write down an expression f⁻¹(y) that involves only y and show that “f⁻¹(y) = x ⇐⇒ y = f (x)”.


Example 57. As we showed above, the function f ∶ [0, 1] → R defined by x ↦ x + 1 is one-to-one. So its inverse function f⁻¹ exists. Let’s find it.

1. f has range [1, 2]. So f⁻¹ has domain [1, 2].
2. f has domain [0, 1]. So f⁻¹ has codomain [0, 1].
3. Pick any element y in the range of f and write: y = f (x) ⇐⇒ y = x + 1 ⇐⇒ y − 1 = x. And so f⁻¹(y) = y − 1.

So f⁻¹ has mapping rule y ↦ y − 1.

We’ll actually only formally talk about graphs in the next few chapters. But for now, as a visual aid, I’ll provide the graphs of f⁻¹ (blue) and f (red) anyway.

[Figure: graphs of f (red) and f⁻¹ (blue), with the line y = x dotted.]

Observe that f⁻¹ is simply the reflection of f in the line y = x (dotted). Section 7.2 (in particular Fact 7) will explain why exactly this is so.
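As a sanity check on Example 57, the sketch below confirms on a few sample points that applying f and then the candidate inverse returns the original input. The names `f` and `f_inv` and the chosen sample points are assumptions for illustration; domain checks are omitted to keep the sketch short.

```python
def f(x):
    return x + 1  # intended domain: [0, 1]


def f_inv(y):
    return y - 1  # intended domain: [1, 2], the range of f


for x in [0, 0.25, 0.5, 1]:
    y = f(x)
    # (x, y) is a point of f, and the swapped pair (y, x) is a point of f_inv.
    assert f_inv(y) == x
    print((x, y), "->", (y, f_inv(y)))
```

Swapping the two coordinates of each point is exactly what a reflection in the line y = x does, which is consistent with the observation about the two graphs.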


Example 58. You can verify for yourself that the function f ∶ R → R defined by x ↦ 2x is one-to-one. Let’s find its inverse function f⁻¹.

1. f has range R. So f⁻¹ has domain R.
2. f has domain R. So f⁻¹ has codomain R.
3. As usual, let’s pick any element y in the range of f and write: y = f (x) ⇐⇒ y = 2x ⇐⇒ 0.5y = x. And so f⁻¹(y) = 0.5y.

So f⁻¹ has mapping rule y ↦ 0.5y.

Example 59. You can verify for yourself that the function f ∶ R+ ∪ R− → R defined by x ↦ 1/x is one-to-one. Let’s find its inverse function f⁻¹.

1. f has range R+ ∪ R− . So f⁻¹ has domain R+ ∪ R− .
2. f has domain R+ ∪ R− . So f⁻¹ has codomain R+ ∪ R− .
3. As usual, let’s pick any element y in the range of f and write: y = f (x) ⇐⇒ y = 1/x ⇐⇒ 1/y = x (∵ y ≠ 0). And so f⁻¹(y) = 1/y.

So f⁻¹ has mapping rule y ↦ 1/y.

(Note that ∵ is the shorthand symbol for because. Similarly, ∴ is the shorthand symbol for therefore.)

The condition here that y ≠ 0 is important and goes back to our warning in Chapter 2 (Dividing by Zero). We know for sure that the range of f does not contain 0. This is why in the last line above, we can safely divide both sides of the equation by y.

Example 60. You can verify for yourself that the function f ∶ R₀⁺ → R defined by x ↦ x² is one-to-one. Let’s find its inverse function f⁻¹.

1. f has range R₀⁺. So f⁻¹ has domain R₀⁺.
2. f has domain R₀⁺. So f⁻¹ has codomain R₀⁺.
3. As usual, let’s pick any element y in the range of f and write: y = f (x) ⇐⇒ y = x² ⇐⇒ ±√y = x, where ±√y is our candidate for f⁻¹(y).

Here there are two possibilities for the mapping rule of f⁻¹, namely y ↦ √y and y ↦ −√y. We must pick one. We know that the domain of f (and hence the codomain of f⁻¹) is R₀⁺. So we should pick as the mapping rule of f⁻¹ ∶ y ↦ √y.
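The choice between the two candidate rules in Example 60 can be illustrated numerically. In this sketch the names `f`, `f_inv`, and `other_candidate` are assumptions for illustration; `math.sqrt` returns the non-negative square root, which matches the rule y ↦ √y.

```python
import math


def f(x):
    return x ** 2  # intended domain: the non-negative reals


def f_inv(y):
    return math.sqrt(y)  # the non-negative root, so values stay in the codomain


def other_candidate(y):
    return -math.sqrt(y)  # the rejected rule y |-> -sqrt(y)


for x in [0, 1, 2, 3.5]:
    assert f_inv(f(x)) == x  # f_inv undoes f on the sample points

print(f_inv(4))            # 2.0
print(other_candidate(4))  # -2.0, which is not in the codomain of the inverse
```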

Exercise 44. Find the inverse function for each of the following functions. (a) f ∶ R₀⁺ → R defined by x ↦ √x. (b) g ∶ [−0.5π, 0.5π] → R defined by x ↦ sin x. (c) h ∶ R → R defined by x ↦ x³. (Answers on p. 939.)


3.8 Domain Restriction to Create an Invertible Function

We saw that some functions were not one-to-one (i.e. were non-invertible). And so for these functions, an inverse function simply does not exist. Nonetheless, we can often transform a non-invertible function into an invertible function. One way to do this is by restricting the domain. The new invertible function will then have an inverse function.

Example 61. We saw in Exercise 43 that the function j ∶ R → R defined by x ↦ sin x was not one-to-one. However, we can restrict the domain to [−0.5π, 0.5π] to get a brand new function g ∶ [−0.5π, 0.5π] → R defined by x ↦ sin x. This brand new function g is identical to the original function j except for its domain. g is one-to-one, as you should verify for yourself. We can thus go ahead and construct the inverse function g⁻¹. Actually, we already did this in Exercise 44.

Example 62. We saw in Example 56 that the function f ∶ R → R defined by x ↦ x² was not one-to-one. However, we can restrict the domain to R₀⁺ to get a brand new function g ∶ R₀⁺ → R defined by x ↦ x². This brand new function g is identical to the original function f except for its domain. g is one-to-one, as we verified in Exercise 43. We can thus go ahead and construct the inverse function g⁻¹. I leave this as an exercise for you.
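The effect of the domain restriction in Example 61 can be seen numerically. The following is a rough sketch: the names `g` and `g_inv` follow the example, the domain check via `ValueError` is an assumption, and `math.asin` (which returns values in [−0.5π, 0.5π]) is used as the candidate inverse.

```python
import math

# On all of R, sin is not one-to-one: 0 and pi are both sent to (roughly) 0.
print(math.isclose(math.sin(0), math.sin(math.pi), abs_tol=1e-12))  # True


def g(x):
    """sin with its domain restricted to [-0.5*pi, 0.5*pi]."""
    if not (-0.5 * math.pi <= x <= 0.5 * math.pi):
        raise ValueError("x is outside the restricted domain")
    return math.sin(x)


def g_inv(y):
    return math.asin(y)  # returns a value in [-0.5*pi, 0.5*pi]


for x in [-1.5, -0.3, 0.0, 0.3, 1.5]:
    assert math.isclose(g_inv(g(x)), x, abs_tol=1e-9)
```

Floating-point arithmetic is inexact, so the comparisons use `math.isclose` rather than `==`.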

There is almost always more than one way to restrict the domain of a non-invertible function to obtain an invertible function. Indeed, a trivial case would be where we restrict its domain to be the empty set! In which case the function thus formed would certainly be invertible, though not very interesting (it would have an empty domain and an empty range, and so too would its inverse function).

Exercise 45. (a) Show that the function f ∶ (−∞, 1) ∪ (1, ∞) → R defined by x ↦ 1/(x − 1)² is not one-to-one. (b) Show that by restricting its domain to (1, ∞), we can create a new invertible function g (you must prove that this new function is invertible). (c) Then find the inverse function g⁻¹. (Answer on p. 940.)

Exercise 46. For the function f in Example 62, let’s instead restrict the domain to [20, 30]. Show that the new function thus obtained is one-to-one and find its inverse. (Answer on p. 940.)


3.9 Composite Functions

Definition 18. Let f and g be functions such that the range of g is a subset of the domain of f . Then the composite function f g is the function with the same domain as g, the same codomain as f , and mapping rule x ↦ f (g(x)).

The composite function f g can be read aloud as “f circle g” and is sometimes denoted f ○ g, especially when we want to make clear that we are not talking about f ⋅ g. But we’ll rarely use the f ○ g notation, unless there is some risk of confusion with f ⋅ g.

The condition in Definition 18 is important: The range of g must be a subset of the domain of f in order for the composite function f g to exist. This condition ensures that given any x from the domain of g, the value g(x) is itself also in the domain of f , so that f (g(x)) is well-defined. If this condition fails, then the composite function f g simply does not exist.

Example 63. The functions g, f ∶ R → R are defined by g ∶ x ↦ x + 1 and f ∶ x ↦ 2x. The range of g is R — this is indeed a subset of the domain of f (which is R). So the composite function f g ∶ R → R exists and is defined by x ↦ f (g(x)) = 2 (g(x)) = 2(x + 1).

Let’s try computing f g(2). We can use the definition of a composite function: f g(2) = f (g(2)) = f (2 + 1) = f (3) = 6. Alternatively, we can directly use f g(x) = 2(x + 1) to compute f g(2) = 2(2 + 1) = 6.

Notice that for the composite function f g, we apply the function g first before applying the function f . So for example, to compute, say f g(7), we compute g(7) first, then compute f (g(7)). (A common mistake by students is to instinctively read from left to right, and so apply f first before g.)
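The order of application can be made concrete with a small Python sketch of Example 63. The helper `compose` is a hypothetical name; it builds x ↦ f (g(x)), so g is applied first.

```python
def compose(f, g):
    """Return the composite fg, i.e. x |-> f(g(x)). Note that g is applied first."""
    def fg(x):
        return f(g(x))
    return fg


def g(x):
    return x + 1


def f(x):
    return 2 * x


fg = compose(f, g)
gf = compose(g, f)

print(fg(2))  # 6, since f(g(2)) = f(3) = 6
print(gf(2))  # 5, since g(f(2)) = g(4) = 5
```

The two outputs differ, which shows why applying f first by mistake gives the wrong answer.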

Example 64. The functions g, f ∶ R → R are defined by g ∶ x ↦ x² and f ∶ x ↦ x + 1. The range of g is R₀⁺, which is indeed a subset of the domain of f (which is R). So the composite function f g ∶ R → R exists and is defined by x ↦ f (g(x)) = g(x) + 1 = x² + 1.

Let’s try computing f g(3). We can use the definition of a composite function: f g(3) = f (g(3)) = f (3²) = f (9) = 10. Alternatively, we can directly compute, using f g(x) = x² + 1: f g(3) = 3² + 1 = 10.


Example 65. The function g ∶ R → R is defined by x ↦ x + 1. The function f ∶ R+ → R is defined by x ↦ ln x. The range of g is R, which is not a subset of the domain of f (which is R+ ). Hence, the composite function f g simply does not exist.

We saw that if f is non-invertible, then its inverse function f⁻¹ simply does not exist. Nonetheless, we could restrict its domain to create a new invertible function g, whose inverse function g⁻¹ we could then write down.

By analogy, suppose we have functions f and g where g’s range is not a subset of f ’s domain. Thus, the composite function f g simply does not exist. But we can play a similar trick: We can restrict the domain of g to create a new function ĝ, so that the range of ĝ is a subset of f ’s domain. We can then write down the composite function f ĝ. Fortunately, this is not in the syllabus, so you don’t need to know how to do this. Yay!

Exercise 47. For each of the following pairs of functions f and g, verify that the composite function f g exists and write it out in full. Also, compute f g(1) and f g(2). (a) The functions g, f ∶ R → R defined by g ∶ x ↦ x² + 1 and f ∶ x ↦ e^x. (b) The functions g, f ∶ R → R defined by g ∶ x ↦ e^x and f ∶ x ↦ x² + 1. (c) The functions g, f ∶ R− ∪ R+ → R defined by g ∶ x ↦ 1/(2x) and f ∶ x ↦ 1/x. (d) The functions g, f ∶ R− ∪ R+ → R defined by g ∶ x ↦ 1/x and f ∶ x ↦ 1/(2x). (Answer on p. 941.)

We can of course also build a composite function out of a single function.

Example 66. The function f ∶ R → R is defined by x ↦ 2x. The range of f is R and this is indeed a subset of the domain of f (which is R). So the composite function f f ∶ R → R exists and is defined by x ↦ f (f (x)) = 2f (x) = 2(2x) = 4x. And so for example f f (3) = 2(2 × 3) = 12.

The composite function f f can instead be written as f². So in the above example, we’d write f²(3) = 12.

We can, analogously, define the composite function f f² and denote it f³. Using the above example, f³(x) = 8x and f³(3) = 24. Of course, there are also f⁴, f⁵, etc.
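Repeated composition is easy to sketch in Python. The helper `iterate` below is a hypothetical name; it returns the composite of n copies of f , using the function from Example 66.

```python
def iterate(f, n):
    """Return the composite of n copies of f (for n >= 1)."""
    def fn(x):
        for _ in range(n):
            x = f(x)  # apply f once more to the running value
        return x
    return fn


def f(x):
    return 2 * x


f2 = iterate(f, 2)
f3 = iterate(f, 3)

print(f2(3))  # 12
print(f3(3))  # 24
```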


Remark 5. The official A-level syllabus uses f² to mean the composite function f f and nothing else. So this is what we’ll do in this textbook. But confusingly enough, some writers use the symbol f² to mean “the second derivative of f ”, f³ to mean “the third derivative of f ”, etc. We won’t follow such practice. This is just to let you know, in case you read other mathematical texts and get confused. However, we will use f⁽³⁾ (with the 3 in parentheses) to mean “the third derivative of f ”, f⁽⁴⁾ to mean “the fourth derivative of f ”, etc. This will show up occasionally in Part V (Calculus).

Exercise 48. For each of the following functions f , verify that the composite function f² exists and write it out in full. Also, compute f²(1) and f²(2). (a) The function f ∶ R → R defined by x ↦ e^x. (b) The function f ∶ R → R defined by x ↦ 3x + 2. (c) The function f ∶ R → R defined by x ↦ 2x² + 1. (Answer on p. 941.)


4 Graphs

An ordered pair is a mathematical object. Like a set of two objects, an ordered pair is, informally, a “container” with two objects, where the objects are listed out with a comma separating them. The only difference between a set of two objects and an ordered pair is that order matters for the latter. To distinguish an ordered pair from a set of two objects, we use parentheses (instead of braces). Example 67. (Cow, Chicken) is an ordered pair. (−5, 4) is an ordered pair.

We also refer to (a, b) as ordered set notation. So (Cow, Chicken) and (−5, 4) are both examples of ordered pairs, written out in ordered set notation.

Example 68. Let (Cow, Chicken) and (Chicken, Cow) be ordered pairs. Let {Cow, Chicken} and {Chicken, Cow} be sets. Recall that for sets, order did not matter. Hence, {Cow, Chicken} = {Chicken, Cow}.

In contrast, for ordered pairs, order does matter. And so (Cow, Chicken) ≠ (Chicken, Cow).
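Python happens to have built-in types that behave the same way, which gives a quick illustration of Example 68: a `set` ignores order, while a `tuple` does not. The variable names are chosen only for this sketch.

```python
set_a = {"Cow", "Chicken"}
set_b = {"Chicken", "Cow"}
pair_a = ("Cow", "Chicken")
pair_b = ("Chicken", "Cow")

print(set_a == set_b)    # True: for sets, order does not matter
print(pair_a == pair_b)  # False: for ordered pairs, order matters
```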

Definition 19. An ordered pair of real numbers is any (x, y) where both x, y are real. Example 69. (−5, 4), (1, 1), and (2, −3) are all ordered pairs of real numbers.

Confusingly, above in Section 1.9 (Intervals), we said that (−5, 4) was a set, namely {x ∈ R ∶ −5 < x < 4}. Here we say instead that (−5, 4) is an ordered pair, consisting of two objects (−5 and 4), the order of which matters.

Unfortunately this is yet another bit of confusing notation you’ll have to live with. You’ll have to learn to tell, from the context, whether (−5, 4) is a set of infinitely-many real numbers or an ordered pair. But don’t worry, this is usually pretty obvious.


Definition 20. In any ordered pair of real numbers, the first real is called the x-coordinate and the second is the y-coordinate. Definition 21. The cartesian plane is the set of all ordered pairs of real numbers.

In set-builder notation, the cartesian plane can be written as {(x, y) ∶ x ∈ R, y ∈ R}. This is read aloud as “the cartesian plane is the set of ordered pairs of real numbers (x, y)”.

In this textbook, we’ll usually only ever look at ordered pairs of real numbers. Hence, rather than say “ordered pair of real numbers”, we’ll simply say “ordered pair”. And so whenever you see the notation (x, y), it should be understood that this is an ordered pair of real numbers (and not cows or chickens).

And so instead of writing the cartesian plane as {(x, y) ∶ x ∈ R, y ∈ R}, we’ll simply write it as {(x, y)}, with the understanding that x, y are reals.

In the present context, we’ll also simply call any ordered pair of real numbers a point. (Later on, in the context of three-dimensional geometry, points will also refer to ordered triples of real numbers.) Definition 22. In the context of the cartesian plane, the origin is the point (0, 0).

Points are usually given lower-case letters as names.


We can illustrate the cartesian plane graphically. The horizontal axis corresponds to the x-coordinate of the points and is thus also called the x-axis. The vertical axis corresponds to the y-coordinate of the points and is thus also called the y-axis.

Example 70. The points (or ordered pairs of real numbers) a = (−5, 4), b = (1, 1), and c = (2, −3) are illustrated graphically on the cartesian plane:

[Figure: the points a, b, and c plotted on the cartesian plane.]

Definition 23. A graph (or curve) is any set of points. Example 71. The set of three points {a, b, c} = {(−5, 4), (1, 1), (2, −3)} is a graph.


The graph of a function f is simply the set of points (x, y) that satisfy x ∈ D, y ∈ C, and f (x) = y. Formally: Definition 24. The graph of a function f ∶ D → C is the set {(x, y) ∶ x ∈ D, y ∈ C, y = f (x)}.

Given a function that is named with a lower-case letter, we will often use the upper-case version of that same letter to denote that function’s graph. So for example, given the function f , we often give its graph the name F .

Example 72. Consider the function f ∶ R → R defined by x ↦ x². Its graph may be written as F = {(x, y) ∶ y = x²}.

We’ve defined graph as a noun. But at the slight risk of confusion, we’ll also use it as a verb that means “draw in the cartesian plane a given set of points”. So we can say either “we draw the graph of f ” (graph as a noun), or “we graph f ” (graph as a verb).


Definition 25. Given a function f ∶ D → C, a point of the function is any element of its graph. That is, it is any ordered pair (x, f (x)), where x ∈ D. To use the above example, we say that (2, 4) and (5, 25) are both points of f .
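Since the graph of a function is just a set of ordered pairs, a finite piece of it can be built directly in Python. This is a sketch only: the full graph F of x ↦ x² has infinitely many points, so the name `F_sample` and the handful of chosen x-values are assumptions for illustration.

```python
def f(x):
    return x ** 2


sample_domain = [-2, -1, 0, 1, 2, 5]  # a few elements of the domain R

# A finite piece of the graph: the ordered pairs (x, f(x)).
F_sample = {(x, f(x)) for x in sample_domain}

print((2, 4) in F_sample)   # True: (2, 4) is a point of f
print((5, 25) in F_sample)  # True: (5, 25) is a point of f
print((2, 5) in F_sample)   # False: f(2) is 4, not 5
```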

But since x determines f (x), it is nice but not necessary to specify the complete ordered pair (x, f (x)). Instead, we can refer to the point simply as x. So in the above example, we can simply say that “2 and 5 are both points of f ”, with the understanding that what we really mean is “(2, 4) and (5, 25) are both points of f ”. This is a bit sloppy and at the risk of some confusion, but will save us a lot of messy notation. So in the context of functions, x does double duty. It can either refer to an element in the function’s domain OR it can refer to a point of the function.

On exams though, it is probably safer to simply list out the full co-ordinates, whenever you’re referring to a point. Just in case your marker is damn niao.

We just learnt about the graph of a function. A graph of an equation is similarly defined:


Definition 26. Given an equation involving x and y as the only two variables, the graph of the equation is the set of points (x, y) that satisfy the equation.

Example 73. The graph of the equation x² + y² = 1 is simply the set {(x, y) ∶ x² + y² = 1}.

𝑦

𝑝 = (𝑥, 𝑦) 𝑥

𝑥2 + 𝑦2 = 1 𝑦

1

𝑦 𝑥

-1.0

-0.5

0.0

0.5

1.0

Exercise 49. (a) Can the equation $x^2 + y^2 = 1$ be rewritten into the form of a single function? (b) Can it be rewritten into the form of two functions? (Answer on p. 942.)

Exercise 50. Draw the graphs of each of the following equations. (a) $y = e^x$. (b) $y = 3x + 2$. (c) $y = 2x^2 + 1$. (Answers on pp. 942, 943, and 944.)


4.1 Graphing with Your TI84 Graphing Calculator

Example 74. Graph the function f ∶ R → R defined by $x \mapsto x^2$.

1. Press ON to turn on your calculator.
2. Press Y= to bring up the Y= editor.
3. Press X,T,θ,n to enter “X”; then x² to enter the squared “²” symbol.
4. Now press GRAPH and the calculator will graph $y = x^2$.

[Screenshots: the calculator screen after each of Steps 1 to 4.]

Example 75. Graph the equation $x^2 + y^2 = 1$.

The TI84 requires that we enter equations in a form where y is directly expressed in terms of x. But (as we saw in Exercise 49), there is no way to rewrite the equation $x^2 + y^2 = 1$ so that y is expressed as a single function in terms of x. So we’ll have to tell the TI84 to graph two separate equations: $y = \sqrt{1 - x^2}$ and $y = -\sqrt{1 - x^2}$.

1. Press ON to turn on your calculator.
2. Press Y= to bring up the Y= editor.

Most buttons on the TI84 have three different roles. Simply pressing a button executes the role printed on the button itself. Pressing the blue 2ND and then a button executes the role printed in blue above the button. And pressing the green ALPHA and then a button executes the role printed in green above the button.

3. Press the blue 2ND button and then √ (which corresponds to the x² button) to enter “√(”. Next press 1 − X,T,θ,n x² ) . Altogether you’ve entered $\sqrt{1 - x^2}$.
4. Now press ENTER and the blinking cursor will move down, to the right of “Y2 =”.

We’ll now enter the second equation.

5. Press the (-) button. (Warning: This is different from the − button. If you use the − button, you will get an error message when you try to generate your graphs later.) Now repeat what we did in step 3 above: Press the blue 2ND button and then √ (which corresponds to the x² button) to enter “√(”. Next press 1 − X,T,θ,n x² ) to enter “1 − X²)”. Altogether you will have entered $-\sqrt{1 - x^2}$.
6. Now press GRAPH and the calculator will graph both $y = \sqrt{1 - x^2}$ and $y = -\sqrt{1 - x^2}$.

Notice the graphs are very small. To zoom in:

7. Press the ZOOM button to bring up a menu of ZOOM options.
8. Press 2 to select the Zoom In option. Nothing seems to happen. But now press ENTER and the TI84 will zoom in a little for you.

We expected to see a perfect circle. Instead we get an elongated oval. What’s going on? The reason is that by default, the x- and y-axes are scaled differently. To set them to the same scale:

9. Press the ZOOM button again to bring up the ZOOM menu of options. Press 5 to select the ZSquare option. Nothing seems to happen. But now press ENTER and the TI84 will adjust the x- and y-axes to have the same scale. And now we have a perfect circle.

[Screenshots: the calculator screen after each of Steps 1 to 9.]
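The idea of splitting the circle into two separate equations can also be checked away from the calculator. The following is only a rough sketch in Python (not part of the TI84 procedure): the function names and the sample x-values are made up for illustration. It computes points on each half and confirms that they satisfy $x^2 + y^2 = 1$ up to rounding error.

```python
import math

def upper_branch(x):
    """y = sqrt(1 - x^2): the top half of the circle."""
    return math.sqrt(1 - x * x)

def lower_branch(x):
    """y = -sqrt(1 - x^2): the bottom half of the circle."""
    return -math.sqrt(1 - x * x)

def on_unit_circle(x, y, tol=1e-12):
    """Check whether (x, y) satisfies x^2 + y^2 = 1, up to rounding error."""
    return abs(x * x + y * y - 1) <= tol

# A few sample x-values in [-1, 1] (an arbitrary choice for illustration).
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
points = [(x, upper_branch(x)) for x in xs] + [(x, lower_branch(x)) for x in xs]

all_on_circle = all(on_unit_circle(x, y) for x, y in points)
print(all_on_circle)
```

Neither half alone gives the whole circle, which is why two equations are needed.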

5 Quick Revision: Exponents, Surds, Absolute Value

Here’s a super quick revision of some O-Level Maths we’ll be using. If you have severe difficulty with these exercises, you should go back and review your O-Level Maths material!

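5.1 Laws of Exponents

For all real numbers x, we have $x^1 = x$ and $x^0 = 1$ (see footnote 21).

For all real numbers x, y, a, and b (provided any denominators are non-zero):

$$x^a \cdot x^b = x^{a+b}, \qquad x^{-a} = \frac{1}{x^a}, \qquad \frac{x^a}{x^b} = x^{a-b},$$

$$(x^a)^b = x^{ab}, \qquad (xy)^a = x^a y^a, \qquad \left(\frac{x}{y}\right)^a = \frac{x^a}{y^a},$$

$$a^{1/b} = \sqrt[b]{a}, \qquad a^{c/b} = \sqrt[b]{a^c} = \left(\sqrt[b]{a}\right)^c.$$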
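As a quick sanity check (not a proof), the laws of exponents can be tested numerically. The sketch below is in Python and uses made-up sample values, deliberately chosen to be positive so that every power is a real number. `math.isclose` is used because floating-point arithmetic is only approximate.

```python
import math

# Hypothetical sample values, all positive so that every power is real.
x, y, a, b, c = 2.0, 3.0, 1.5, 4.0, 3.0

checks = {
    "x^a * x^b = x^(a+b)": math.isclose(x**a * x**b, x**(a + b)),
    "x^(-a) = 1/x^a": math.isclose(x**(-a), 1 / x**a),
    "x^a / x^b = x^(a-b)": math.isclose(x**a / x**b, x**(a - b)),
    "(x^a)^b = x^(ab)": math.isclose((x**a)**b, x**(a * b)),
    "(xy)^a = x^a y^a": math.isclose((x * y)**a, x**a * y**a),
    "(x/y)^a = x^a / y^a": math.isclose((x / y)**a, x**a / y**a),
    "a^(c/b) = (a^(1/b))^c": math.isclose(a**(c / b), (a**(1 / b))**c),
}

for law, holds in checks.items():
    print(law, holds)
```

A check like this can only catch a wrongly remembered law; it cannot establish that a law holds for all values.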

Exercise 51. (Answer on p. 945.) Simplify the two expressions below.

$$\frac{5^{3x} \cdot 25^{1-x}}{5^{2x+1} + 3(25^x) + 17(5^{2x})}, \qquad \frac{8^{x+2} - 34(2^{3x})}{(\sqrt{8})^{2x+1}}.$$

Exercise 52. (Answer on p. 946.) Is each of the following true? (If true, explain why. If false, simply give a counterexample.) (i) $x^{(a^b)} = x^{ab}$; (ii) $(x^a)^b = x^{ab}$.

Footnote 21. By convention, $0^0$ is usually defined to be equal to 1; this textbook will follow this practice.

5.2 Rationalising the Denominator of a Surd

Example 76. Here’s a case where there’s just a surd in the denominator:

$$\frac{1}{\sqrt{2}} = \frac{\sqrt{2}}{\sqrt{2} \times \sqrt{2}} = \frac{\sqrt{2}}{2}.$$

For more complicated cases, the trick is to use the fact that $(a + b)(a - b) = a^2 - b^2$.

Given a + b, we call a − b the conjugate of a + b. We refer to a + b and a − b as a conjugate pair.

Example 77.

$$\frac{1}{1 + \sqrt{2}} = \frac{1 - \sqrt{2}}{(1 + \sqrt{2})(1 - \sqrt{2})} = \frac{1 - \sqrt{2}}{1^2 - (\sqrt{2})^2} = \frac{1 - \sqrt{2}}{1 - 2} = \frac{1 - \sqrt{2}}{-1} = \sqrt{2} - 1.$$

Exercise 53. (Answer on p. 946.) Prove the following equality:

$$\frac{1}{\dfrac{x}{y} + \sqrt{\dfrac{x^2}{y^2} + 1}} = \sqrt{\frac{x^2}{y^2} + 1} - \frac{x}{y}.$$
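The two worked examples on rationalising the denominator can be double-checked numerically. This is just a small Python sketch with made-up variable names. It compares each original expression with its rationalised form, using `math.isclose` to allow for floating-point rounding.

```python
import math

# Example 76: 1/sqrt(2) should equal sqrt(2)/2.
lhs_76 = 1 / math.sqrt(2)
rhs_76 = math.sqrt(2) / 2

# Example 77: 1/(1 + sqrt(2)) should equal sqrt(2) - 1.
lhs_77 = 1 / (1 + math.sqrt(2))
rhs_77 = math.sqrt(2) - 1

print(math.isclose(lhs_76, rhs_76), math.isclose(lhs_77, rhs_77))
```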

5.3 Absolute Value

The notation ∣z∣ returns the absolute value of z. (See footnote 22 for a more formal definition.)

$$|z| = \begin{cases} z, & \text{if } z \ge 0, \\ -z, & \text{if } z < 0. \end{cases}$$

Example 78. $|4| = 4$ and $|-4| = 4$. $|\sqrt{2}| = \sqrt{2}$ and $|-\sqrt{2}| = \sqrt{2}$.

Fact 2. Let b ≥ 0.

(a) ∣x∣ < b ⇐⇒ −b < x < b.

(b) ∣x∣ ≤ b ⇐⇒ −b ≤ x ≤ b.

Proof. (a) ∣x∣ < b ⇐⇒ “0 ≤ x < b OR −b < x < 0” ⇐⇒ −b < x < b. (b) Very similar.

Fact 3. Let b ≥ 0.

(a) ∣x − a∣ < b ⇐⇒ a − b < x < a + b. (b) ∣x − a∣ ≤ b ⇐⇒ a − b ≤ x ≤ a + b.

Proof. (a) By Fact 2, ∣x − a∣ < b if and only if −b < x − a < b. Rearranging the latter set of inequalities yields a − b < x < a + b.

(b) Very similar.
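Facts 2 and 3 can be spot-checked by brute force. The sketch below (Python, with hypothetical function names and an arbitrary grid of integers) tests whether the two sides of each equivalence in Fact 3 are always true together or false together. Fact 2 is the special case a = 0, so it is covered too. This is a sanity check on a finite grid, not a substitute for the proofs.

```python
def fact_3a_holds(x, a, b):
    """|x - a| < b should be true exactly when a - b < x < a + b."""
    return (abs(x - a) < b) == (a - b < x < a + b)

def fact_3b_holds(x, a, b):
    """|x - a| <= b should be true exactly when a - b <= x <= a + b."""
    return (abs(x - a) <= b) == (a - b <= x <= a + b)

# Integer grid (arbitrary choice); b ranges over non-negative values only.
grid = range(-6, 7)
fact_3_ok = all(
    fact_3a_holds(x, a, b) and fact_3b_holds(x, a, b)
    for x in grid
    for a in grid
    for b in range(0, 7)
)
print(fact_3_ok)
```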

Footnote 22. The absolute value operator ∣⋅∣ is the function with domain R, codomain $\mathbb{R}_0^+$, and mapping rule x ↦ x if x ≥ 0 and x ↦ −x if x < 0.

Fact 4. $\dfrac{|a|}{|b|} = \left|\dfrac{a}{b}\right|$ (provided b ≠ 0).

Proof. If a = 0, then clearly this is true.

If a and b have the same signs (and are non-zero), then $\frac{a}{b} \ge 0$ so that $\left|\frac{a}{b}\right| = \frac{a}{b}$. Moreover, either $\frac{|a|}{|b|} = \frac{a}{b}$ or $\frac{|a|}{|b|} = \frac{-a}{-b} = \frac{a}{b}$. Altogether then, indeed $\frac{|a|}{|b|} = \left|\frac{a}{b}\right|$.

If a and b have opposite signs (and are non-zero), then $\frac{a}{b} < 0$ so that $\left|\frac{a}{b}\right| = -\frac{a}{b}$. Moreover, either $\frac{|a|}{|b|} = \frac{-a}{b}$ or $\frac{|a|}{|b|} = \frac{a}{-b}$. Altogether then, indeed $\frac{|a|}{|b|} = \left|\frac{a}{b}\right|$.

6 Intercepts

Example 79. The graph below is of the equation y = x + 3. It has horizontal intercept −3 and vertical intercept 3.

[Figure: graph of y = x + 3.]

Horizontal intercepts are the x-coordinates of the points at which the graph intersects the horizontal or x-axis. Similarly, vertical intercepts are the y-coordinates of the points at which the graph intersects the vertical or y-axis. Definition 27. a is a horizontal intercept (or x-intercept) of a graph G if (a, 0) ∈ G. Definition 28. b is a vertical intercept (or y-intercept) of a graph G if (0, b) ∈ G.


Where the graph G is of an equation (or function), we sometimes also call the horizontal intercepts zeros or roots of the equation (or function). (We’ll use the terms zeros and roots interchangeably in this textbook.)

Example 80. The graph below is of the function f ∶ R → R defined by $x \mapsto x^2 - 1$. It has vertical intercept −1 and two horizontal intercepts, −1 and 1.

−1 and 1 are also the zeros or roots of f , because f (−1) = 0 and f (1) = 0.

[Figure: graph of $f(x) = x^2 - 1$.]
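The intercepts in Example 80 can be recomputed directly from the definitions: the vertical intercept is f(0), and the horizontal intercepts are the solutions of f(x) = 0. The Python sketch below is only an illustration. It assumes the usual quadratic formula, and its variable names are made up. It also lists the full co-ordinates of the points where the graph crosses the axes.

```python
import math

def f(x):
    return x * x - 1

# Vertical intercept: the y-coordinate of the point where x = 0.
vertical_intercept = f(0)

# Horizontal intercepts: solve x^2 - 1 = 0 with the quadratic formula,
# using coefficients a = 1, b = 0, c = -1.
a, b, c = 1, 0, -1
discriminant = b * b - 4 * a * c
horizontal_intercepts = sorted({
    (-b - math.sqrt(discriminant)) / (2 * a),
    (-b + math.sqrt(discriminant)) / (2 * a),
})

# Full co-ordinates of the points at which the graph crosses the axes.
axis_crossings = [(x, 0.0) for x in horizontal_intercepts]
axis_crossings.append((0.0, float(vertical_intercept)))
print(axis_crossings)
```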

The A-level exams will often ask you to write down the full co-ordinates of the points at which a graph (or curve) crosses the axes. This means writing down both the x- and y-coordinates, and not just the horizontal intercept or the vertical intercept. Here’s an exercise to help you make this a habit.

Exercise 54. Write down in full the point(s) at which the graph of each of the following equations crosses the axes: (a) $x^2 + y^2 = 1$. (b) $y = x^2 - 4$. (c) $y = x^2 + 2x + 1$. (d) $y = x^2 + 2x + 2$. (Answer on p. 947.)

7 Symmetry

7.1 Reflection of a Point in a Line

A reflection of a point in a line is its mirror image on the other side of that line. Formally:

Definition 29. Let a be a point and l1 be a line. Let l2 be the line that is perpendicular to l1 and runs through a. Let x be the point where l1 and l2 intersect. Then the reflection of a in l1 is the point a′ on l2, on the other side of l1 from a, such that the distances ax and a′x are equal. (If a lies on l1, then a′ = a.)

[Figure: the line l1, the perpendicular line l2 through a, their intersection point x, and the reflection a′.]

Fact 5. Let (a, b) be a point. Its reflection in the line y = x is the point (b, a). Proof. Optional, see p. 850 in the Appendices. Fact 6. Let (a, b) be a point. Its reflection in the line y = −x is the point (−b, −a). Proof. Optional, see p. 850 in the Appendices. Example 81. (a) Given the point (3, 17), its reflection in the line y = x is (17, 3) and its reflection in the line y = −x is (−17, −3). (b) Given the point (−1, 5), its reflection in the line y = x is (5, −1) and its reflection in the line y = −x is (−5, 1).

(c) Given the point (0, 0), its reflection in the line y = x is (0, 0) and its reflection in the line y = −x is (0, 0). Exercise 55. For each of the following points, write down their reflections in the lines (i) y = x; and (ii) y = −x. (a) (3, 17). (b) (−1, 5). (c) (0, 0). (Answer on p. 948.)
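Facts 5 and 6 translate directly into two one-line rules. Here is a minimal Python sketch (the function names are made up) that applies them to the three points of Example 81.

```python
def reflect_in_y_equals_x(point):
    """Fact 5: the reflection of (a, b) in the line y = x is (b, a)."""
    a, b = point
    return (b, a)

def reflect_in_y_equals_minus_x(point):
    """Fact 6: the reflection of (a, b) in the line y = -x is (-b, -a)."""
    a, b = point
    return (-b, -a)

for p in [(3, 17), (-1, 5), (0, 0)]:
    print(p, reflect_in_y_equals_x(p), reflect_in_y_equals_minus_x(p))
```

Reflecting a point twice in the same line returns the original point, which is a handy way to catch sign mistakes.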


7.2 Reflection of a Graph in a Line

Definition 30. The reflection of a graph G in a line is the graph G′ that consists of the reflections in that line of all the points in G.

Example 82. The reflection of the graph $G = \{(x, y) : y = x^2 + 4\}$ in the line y = 2 is the graph $G' = \{(x, y) : y = -x^2\}$.

[Figure: the graph $G: y = x^2 + 4$, the line of reflection y = 2, and the reflected graph $G': y = -x^2$.]
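Example 82 can be checked point by point. The sketch below relies on a standard fact that is not stated in the text above and is assumed here: reflecting the point (x, y) in the horizontal line y = c gives the point (x, 2c − y). The function name and the sample x-values are made up for illustration.

```python
def reflect_in_horizontal_line(point, c):
    """Reflect (x, y) in the line y = c (assumed rule: (x, y) -> (x, 2c - y))."""
    x, y = point
    return (x, 2 * c - y)

# A few sample points on G: y = x^2 + 4.
xs = [-2, -1, 0, 1, 2]
G = [(x, x * x + 4) for x in xs]

# Reflect each point in the line y = 2.
G_reflected = [reflect_in_horizontal_line(p, 2) for p in G]

# Every reflected point should lie on G': y = -x^2.
matches_example = all(y == -(x * x) for x, y in G_reflected)
print(G_reflected, matches_example)
```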

Example 83. The reflection of the graph G = {(x, y) ∶ y = ln x} in the line x = 0 is the graph G′ = {(x, y) ∶ y = ln(−x)}.

[Figure: the graph G: y = ln x, the line of reflection x = 0, and the reflected graph G′: y = ln(−x).]

Fact 7 formalises our earlier observation in section 3.7 (Inverse Functions) that the graphs of f and its inverse $f^{-1}$ are reflections of each other in the line y = x.

Fact 7. Let f be an invertible function. Then the reflection of the graph of f in the line y = x is the graph of its inverse function $f^{-1}$.

Proof. Optional, see p. 851 in Appendices.

The next Fact simply makes the obvious observation that the reflection in the line y = x of any point along the line y = x is itself. Fact 8. Let (a, a) be a point. Its reflection in the line y = x is (a, a).

The above two Facts together imply that

Fact 9. Let f be invertible. Suppose f passes through (a, a). Then so too does its inverse $f^{-1}$. And hence, f and $f^{-1}$ intersect at those points where x = f (x).

The above Fact is useful for finding the intersection points of a function and its inverse.

Example 84. Let f ∶ R → R be the invertible function defined by x ↦ 2x. The graph of f intersects the graph of $f^{-1}$ at the point(s) where x = f (x) ⇐⇒ x = 2x ⇐⇒ x = 0. Notice the intersection point (0, 0) is also on the line y = x. See figure on p. 71.

Example 85. Let $f : \mathbb{R}_0^+ \to \mathbb{R}$ be the invertible function defined by $x \mapsto x^2$. The graph of f intersects the graph of $f^{-1}$ at the point(s) where x = f (x) ⇐⇒ $x = x^2$ ⇐⇒ x(x − 1) = 0 ⇐⇒ x = 0 or x = 1. Notice the intersection points (0, 0) and (1, 1) are also on the line y = x. See figure on p. 73.

Be careful not to make the mistake of believing that f and $f^{-1}$ can only intersect at points where x = f (x). A function and its inverse can certainly intersect at points that are not on the y = x line.

Example 86. Let $f : \mathbb{R}^+ \cup \mathbb{R}^- \to \mathbb{R}$ be the invertible function defined by x ↦ 1/x. The graph of f intersects the graph of $f^{-1}$ at the point(s) where x = f (x) ⟹ x = 1/x ⟹ x = −1, 1.

We have merely found two points at which f and $f^{-1}$ intersect. There may very well be other intersection points. Indeed, in this example, f and $f^{-1}$ also intersect at every other x ≠ 0! See figure on p. 72.
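The warning in Example 86 can be made concrete. For x ↦ 1/x, solving y = 1/x for x gives x = 1/y, so the inverse has the same rule as the function itself. The Python sketch below uses made-up names and sample values chosen to be powers of two, so that the floating-point reciprocals are exact. It separates the sample points where the two graphs meet from those that also lie on the line y = x.

```python
def f(x):
    return 1 / x

def f_inverse(y):
    # Solving y = 1/x for x gives x = 1/y, so the inverse has the same rule.
    return 1 / y

# Sample non-zero inputs (arbitrary; powers of two keep the arithmetic exact).
sample_xs = [-4.0, -1.0, -0.5, 0.25, 1.0, 2.0]

# Points where the graphs of f and its inverse meet.
intersections = [x for x in sample_xs if f(x) == f_inverse(x)]

# Points that are also on the line y = x, i.e. where x = f(x).
on_line_y_equals_x = [x for x in sample_xs if f(x) == x]

print(intersections)
print(on_line_y_equals_x)
```

Every sample point is an intersection, but only −1 and 1 satisfy x = f(x). So solving x = f(x) finds some of the intersection points, not necessarily all of them.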

7.3 Lines of Symmetry

Definition 31. A graph is symmetric in a line if it is unchanged after being reflected in that line.

Example 87. The graph of $y = x^2$ is symmetric in the line x = 0 (which also happens to be the vertical axis).

[Figure: graph of $y = x^2$, with its line of symmetry (reflection line) x = 0.]

Example 88. The graph of $y = \frac{1}{x}$ is symmetric in the lines y = x and y = −x.

[Figure: graph of y = 1/x, together with the lines y = x and y = −x.]

8 Limits, Continuity, and Asymptotes

The syllabus makes nearly no mention of limits and none of continuity. Yet differentiation and integration are built entirely on the concept of limits. Continuity is also almost always assumed. It is thus well worth spending an hour or two on these concepts, especially since they’re not difficult and everything will become that much clearer.

8.1 Limits: Introduction and Examples

Here is a very simple example to illustrate the idea of limits. Example 89. Graphed below is the function f ∶ R → R defined by x ↦ 5x + 2. Observe that “as x approaches 3, f (x) approaches 17”. We write this as: Statement #1. As x → 3, f (x) → 17.

(The right arrow symbol “→” means “to” in the context of functions, but here means “approaches” in the context of limits.) Equivalently, we may say “The limit of f (x) as x approaches 3 is equal to 17.” We write:

Statement #2. $\lim_{x \to 3} f(x) = 17$.

Statements #1 and #2 are entirely equivalent. Either may be (informally) interpreted thus:

For all values of x that are close to but not equal to 3, f (x) is close to (or possibly even equal to) 17.

[Figure: graph of f.]

In general, $\lim_{x \to a} f(x) = L$ is informally interpreted as:

For all values of x that are close to but not equal to a, f (x) is close to (or possibly even equal to) L.

This interpretation is informal because the phrase “close to” is vague. For the formal definitions of limits (optional), see Section 84.1 in the Appendices.

The subtle condition “but not equal to” requires emphasis. When considering the limit of f at 3, we do NOT care about the value f (3). Indeed, we do NOT care even if f (3) is undefined! Here’s an example where $\lim_{x \to 3} g(x)$ is well-defined, even though g(3) is not.

Example 90. Graphed below is the function g ∶ (−∞, 3) ∪ (3, ∞) → R defined by x ↦ 5x + 2. Its graph looks almost exactly like that of f (from the previous example), except there is now a “hole” (or more formally, a discontinuity) at x = 3. Nonetheless, it is still true that

For all values of x that are close to but not equal to 3, g(x) is close to (or possibly even equal to) 17.

In formal notation, we write “as x → 3, g(x) → 17” or “$\lim_{x \to 3} g(x) = 17$”.

[Figure: graph of g, with a hole at x = 3.]

In the next example, both h(3) and $\lim_{x \to 3} h(x)$ are well-defined, but $\lim_{x \to 3} h(x) \ne h(3)$.

Example 91. Graphed below is the function h ∶ R → R defined by x ↦ 5x + 2 for x ≠ 3 and h(3) = 0. The graph of h looks almost exactly like those of f and g (from the previous examples), except that now the value of h at x = 3 is, strangely enough, 0. Nonetheless, it is still true that

For all values of x that are close to but not equal to 3, h(x) is close to (or possibly even equal to) 17.

In formal notation, we write “as x → 3, h(x) → 17” or “$\lim_{x \to 3} h(x) = 17$”.

[Figure: graph of h.]

The next example is similar.

Example 92. Graphed below is the function i ∶ R → R defined by x ↦ 0 for x ≠ 3 and i(3) = 17. This graph looks very different from those of f , g, and h (from the previous examples). But like f , we again have i(3) = 17.

We are tempted to conclude that therefore $\lim_{x \to 3} i(x) = 17$. This though is wrong, because we cannot make i(x) as close to 17 as we like by restricting x to values that are close to but not equal to 3. Hence,

As x → 3, i(x) ↛ 17,

or equivalently, $\lim_{x \to 3} i(x) \ne 17$.

Instead, $\lim_{x \to 3} i(x) = 0$, because:

For all values of x that are close to but not equal to 3, i(x) is close to (or possibly even equal to) 0.

[Figure: graph of i.]
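The informal reading of a limit (“for all x close to but not equal to 3”) can be explored numerically. The Python sketch below is only suggestive, since checking finitely many points can never prove a limit. It evaluates f, h, and i at inputs that approach 3 from both sides without ever equalling 3. The list of offsets is an arbitrary choice.

```python
def f(x):
    return 5 * x + 2

def h(x):
    return 0 if x == 3 else 5 * x + 2

def i(x):
    return 17 if x == 3 else 0

# Inputs that get closer and closer to 3 from both sides, but never equal 3.
offsets = [10 ** -k for k in range(1, 7)]
near_3 = [3 - d for d in offsets] + [3 + d for d in offsets]

for d in offsets:
    print(d, f(3 - d), f(3 + d), h(3 - d), h(3 + d), i(3 - d), i(3 + d))

# The values AT 3 play no role in the limits:
print(f(3), h(3), i(3))
```

Near 3, both f and h stay close to 17 even though h(3) = 0, while i stays at 0 even though i(3) = 17.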

Section 8.3 gives more examples of limits. But first, let’s learn about continuity.

8.2 Continuity

Informally, a function is continuous at a point a if there is no “hole” or “jump” at a. And so a function is continuous on an interval (of points) if you can smoothly draw its graph for that entire interval without once lifting your pencil. Formally:

Definition 32. f ∶ D → R is continuous at a ∈ D if $\lim_{x \to a} f(x) = f(a)$.

Section 84.6 in the Appendices contains additional definitions and results concerning continuity (optional).

Example 89 (revisited). Graphed below is the function f ∶ R → R defined by x ↦ 5x + 2. It is continuous at 3, because $\lim_{x \to 3} f(x) = 17$ and f (3) = 17.

It is also continuous at 1, because $\lim_{x \to 1} f(x) = 7$ and f (1) = 7.

Indeed, it is continuous on R, because for any a ∈ R, we have $\lim_{x \to a} f(x) = f(a)$.

Example 90 (revisited). Graphed below is the function g ∶ (−∞, 3) ∪ (3, ∞) → R defined by x ↦ 5x + 2. It is continuous at 1, because $\lim_{x \to 1} g(x) = 7$ and g(1) = 7.

However, it is not continuous at 3, because $\lim_{x \to 3} g(x) = 17$, but g(3) is undefined, and so $\lim_{x \to 3} g(x) \ne g(3)$.

Altogether, g is continuous at any a ∈ (−∞, 3) ∪ (3, ∞), because for any a ∈ (−∞, 3) ∪ (3, ∞), we have $\lim_{x \to a} g(x) = g(a)$. But g fails to be continuous at 3.

Example 91 (revisited). Graphed below is the function h ∶ R → R defined by x ↦ 5x + 2 for x ≠ 3 and h(3) = 0. It is continuous at 1, because $\lim_{x \to 1} h(x) = 7$ and h(1) = 7.

But it is not continuous at 3, because $\lim_{x \to 3} h(x) = 17$, but h(3) = 0 and so $\lim_{x \to 3} h(x) \ne h(3)$.

Altogether, h is continuous at any a ∈ (−∞, 3) ∪ (3, ∞), because for any a ∈ (−∞, 3) ∪ (3, ∞), we have $\lim_{x \to a} h(x) = h(a)$. But h fails to be continuous at 3.

8.3 Limits: More Examples

We now turn to examples where limits do not exist. We start with a trivial example.

Example 93. Graphed below is the function $f : \mathbb{R}_0^+ \to \mathbb{R}$ defined by x ↦ 5x + 2.

There is no number L such that for all values of x that are “close to” but not equal to −3, f (x) is also “close to” L. And so we simply say that $\lim_{x \to -3} f(x)$ does not exist.

This is a trivial example because −3 is “far from” the domain of f . So obviously, for all values of x that are “close to” but not equal to −3, f (x) is undefined and so of course there is no number L that f (x) is always “close to”!

[Figure: graph of f; f is undefined for x < 0.]

The next example is less trivial.

Example 94. Graphed below is the function g ∶ R → R defined by g(0) = 0 and $g(x) = \sin\frac{1}{x}$ for all x ≠ 0. This is a very strange function. As x gets ever “closer to” 0, g(x) fluctuates ever more rapidly between −1 and 1.

[Figure: graph of g.]

It’s difficult or even impossible to draw an accurate graph of g near the origin.

In this example, $\lim_{x \to 0} g(x)$ does not exist. The reason is that for all values of x that are “close to” but not equal to 0, there is no number L that g(x) is “close to”. When x is “close to” 0, g(x) takes on every value in [−1, 1] infinitely often! And so g(x) can never be said to be “close to” any one single number L.
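One way to see why no single L can work is to look at specific inputs. Since sin(π/2 + 2πk) = 1 and sin(3π/2 + 2πk) = −1 for every whole number k, the inputs x = 1/(π/2 + 2πk) give g(x) = 1 and the inputs x = 1/(3π/2 + 2πk) give g(x) = −1, and both families of inputs shrink towards 0 as k grows. The Python sketch below (with made-up names, and only the first few values of k) illustrates this.

```python
import math

def g(x):
    return 0.0 if x == 0 else math.sin(1 / x)

# Inputs where 1/x = pi/2 + 2*pi*k, so that g(x) is (approximately) 1.
peaks = [1 / (math.pi / 2 + 2 * math.pi * k) for k in range(1, 6)]

# Inputs where 1/x = 3*pi/2 + 2*pi*k, so that g(x) is (approximately) -1.
troughs = [1 / (3 * math.pi / 2 + 2 * math.pi * k) for k in range(1, 6)]

for x_peak, x_trough in zip(peaks, troughs):
    print(x_peak, g(x_peak), x_trough, g(x_trough))
```

However small an interval around 0 we pick, it contains inputs of both kinds, so g(x) cannot settle near any one number.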

Altogether then, g is not continuous at 0. (With a little work, we can actually prove that g is continuous on $\mathbb{R}^-$ and also on $\mathbb{R}^+$, but this is beyond the scope of the A-Levels.) In the next example, h fails to be continuous at infinitely many points!

Example 95. Graphed below is the function h ∶ R → R defined by x ↦ 1 if the decimal form representation of x contains the digit 7 and x ↦ 2 otherwise.

This function is arguably even stranger than the previous one. We have for example, h(7) = h(70) = h(1.27) = h(0.0007) = 1 and h(15) = h(16) = h(16.335) = 2.

There are infinitely many points along the line y = 1. And there are also infinitely many points along the line y = 2! It is quite impossible to sketch its graph accurately.

Nonetheless, h is a perfectly well-defined function. Indeed, h(3) is well-defined and h(x) is well-defined for any x ∈ R.

However, $\lim_{x \to 3} h(x)$ does not exist. However we try to restrict x to values that are close to (but not equal to) 3, h(x) is never close to any one single value; instead, h(x) switches infinitely often between 1 and 2.

The same reasoning applies at any point a whose decimal representation does not contain the digit 7: arbitrarily close to a, there are values of x that contain the digit 7 and values of x that do not. So at every such a, h(a) is perfectly well-defined but $\lim_{x \to a} h(x)$ does not exist, and h is not continuous at a.

(The same is not true at every a ∈ R. For example, every x in the interval (7, 8) contains the digit 7, so h(x) = 1 throughout that interval and h is continuous at, say, 7.5.)

[Figure: sketch of h, with points along the lines y = 1 and y = 2.]

Exercise 56. Consider the function f ∶ R → R defined by

$$f(x) = \begin{cases} 1, & \text{if } x \le 0, \\ 2, & \text{if } x > 0. \end{cases}$$

What are $\lim_{x \to -5} f(x)$, $\lim_{x \to 0} f(x)$, and $\lim_{x \to 5} f(x)$? (Answer on p. 949.)

8.4 Infinite Limits and Vertical Asymptotes

This section considers infinite limits, i.e. where as x approaches some number, f (x) increases (or decreases) without bound.

Example 96. Graphed below are the functions f and g, both with domain (−∞, 3) ∪ (3, ∞) and codomain R, defined by

$$f : x \mapsto 2 + \frac{1}{(x - 3)^2} \quad \text{and} \quad g : x \mapsto -2 - \frac{1}{(x - 3)^2}.$$

[Figure: graphs of f and g, with the vertical asymptote x = 3.]

Observe that for all values of x that are “close to” but not equal to 3, there is no number L that f (x) is “close to”. Hence, we say that $\lim_{x \to 3} f(x)$ simply does not exist. Similarly, $\lim_{x \to 3} g(x)$ does not exist either.

Nonetheless, we observe that as x → 3, f (x) increases without bound, while g(x) decreases without bound. By a very special convention, we are allowed to write these observations as:

$$\lim_{x \to 3} f(x) = \infty \quad \text{and} \quad \lim_{x \to 3} g(x) = -\infty.$$

We say that x = 3 is a vertical asymptote of both the graph of f and that of g.

“$\lim_{x \to 3} f(x) = \infty$” must NOT be interpreted to mean that there exists something called “$\lim_{x \to 3} f(x)$” (no such thing exists); or that this thing is equal to some other thing called “∞” (recall that ∞ is not a number!). Instead, “$\lim_{x \to 3} f(x) = \infty$” is interpreted informally as:

f (x) can be made as large as we like, for all values of x that are sufficiently “close to” but not equal to 3.

Again, see Section 84.3 in the Appendices (optional) for the formal definitions.
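The phrase “can be made as large as we like” can be illustrated with the functions f and g of Example 96. The Python sketch below is only a numerical illustration (the list of distances is an arbitrary choice). It evaluates both functions at inputs ever closer to 3 on either side.

```python
def f(x):
    return 2 + 1 / (x - 3) ** 2

def g(x):
    return -2 - 1 / (x - 3) ** 2

# Distances from 3 that shrink towards 0 (but never reach it).
for d in [0.1, 0.01, 0.001]:
    print(d, f(3 - d), f(3 + d), g(3 - d), g(3 + d))
```

Each time the distance from 3 shrinks by a factor of 10, the term $1/(x - 3)^2$ grows by a factor of about 100. So f(x) eventually exceeds any target we care to name, and g(x) eventually falls below any target.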

Here is another example of vertical asymptotes.

Example 97. Graphed below is the equation y = tan x, for x between −π/2 and π/2. It has two vertical asymptotes x = ±π/2, because tan x decreases without bound as x approaches −π/2 and increases without bound as x approaches π/2:

$$\lim_{x \to -\pi/2} \tan x = -\infty \quad \text{and} \quad \lim_{x \to \pi/2} \tan x = \infty.$$

[Figure: graph of y = tan x, with vertical asymptotes x = −π/2 and x = π/2.]

8.5 Limits at Infinity, Horizontal and Oblique Asymptotes

This section considers limits at infinity (not to be confused with the infinite limits discussed in the previous section). That is, the behaviour of f (x) as x increases (or decreases) without bound.

Example 96 (revisited). Reproduced below are the graphs of the functions f and g, both with domain (−∞, 3) ∪ (3, ∞) and codomain R, defined by

$$f : x \mapsto 2 + \frac{1}{(x - 3)^2} \quad \text{and} \quad g : x \mapsto -2 - \frac{1}{(x - 3)^2}.$$

We already saw that f and g both have vertical asymptote x = 3, because as x → 3, f (x) increases without bound and g(x) decreases without bound. We now consider instead what happens as x increases or decreases without bound.

[Figure: graphs of f and g, with their horizontal asymptotes.]

As x increases without bound, f (x) → 2 and g(x) → −2. And as x decreases without bound, f (x) → 2 and g(x) → −2. We can write these observations as

$$\lim_{x \to \infty} f(x) = 2, \quad \lim_{x \to \infty} g(x) = -2, \quad \lim_{x \to -\infty} f(x) = 2, \quad \text{and} \quad \lim_{x \to -\infty} g(x) = -2.$$

We also say that y = 2 is a horizontal asymptote of the graph of f . Similarly, y = −2 is a horizontal asymptote of the graph of g. [See Section 84.4 in the Appendices (optional) for the formal definition of a horizontal asymptote.]

Pedantic point: Infinite limits do not exist. In contrast, limits at infinity DO exist. Here in this example, $\lim_{x \to 3} f(x)$ does not exist. In contrast, $\lim_{x \to \infty} f(x)$ and $\lim_{x \to -\infty} f(x)$ both exist (and are both equal to 2).
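The limits at infinity in Example 96 can likewise be illustrated numerically. In this Python sketch, the sample inputs are an arbitrary choice of increasingly large positive and negative numbers. The printed values of f(x) settle towards 2 and those of g(x) towards −2.

```python
def f(x):
    return 2 + 1 / (x - 3) ** 2

def g(x):
    return -2 - 1 / (x - 3) ** 2

# Inputs that grow without bound in either direction.
for x in [10, 1000, 100000, -10, -1000, -100000]:
    print(x, f(x), g(x))
```

Unlike the behaviour near x = 3, the values here settle down to actual numbers, which is why these limits exist.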

Example 98. Graphed below is the equation $y = e^x$.

As x → −∞, y → 0. We can also write this as $\lim_{x \to -\infty} y = 0$. And we can also say that this graph has horizontal asymptote y = 0.

[Figure: graph of $y = e^x$, with horizontal asymptote y = 0.]

The previous two examples were of horizontal asymptotes. Next is an example of an oblique (or slant) asymptote.

Example 99. Consider the function $f : \mathbb{R}^- \cup \mathbb{R}^+ \to \mathbb{R}$ defined by $x \mapsto x + \frac{1}{x}$.

As x increases without bound or decreases without bound, f (x) approaches the line y = x. More precisely, the gap $f(x) - x = \frac{1}{x}$ shrinks to 0, so that $\lim_{x \to \infty} [f(x) - x] = 0$ and $\lim_{x \to -\infty} [f(x) - x] = 0$.

We can moreover say that y = x is an oblique asymptote of the graph of f .

Again, see Section 84.4 in the Appendices (optional) for the formal definition of an oblique asymptote.
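For the function x ↦ x + 1/x of Example 99, “approaching the line y = x” can be read as saying that the vertical gap between the graph and the line shrinks towards 0. The Python sketch below (sample inputs chosen arbitrarily) prints that gap for increasingly large positive and negative x.

```python
def f(x):
    return x + 1 / x

# The vertical gap between the graph of f and the line y = x.
gaps = [(x, f(x) - x) for x in [10.0, 1000.0, 100000.0, -10.0, -1000.0, -100000.0]]

for x, gap in gaps:
    print(x, gap)
```

The gap is 1/x (up to floating-point rounding), so it is positive to the right of the origin, negative to the left, and tiny far out in either direction.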

[Figure: graph of f, with oblique asymptote y = x.]

9 Differentiation

9.1 Motivation: The Derivative as Slope of the Tangent

The problem of finding the derivative is the problem of finding the slope of the tangent to a graph at a given point.

Graphed below is some function f ∶ R → R. Pick some point A = (a, f (a)). Draw the line l which is tangent to the graph at the point A.

How do we find the slope of l? Unsure of how to proceed, we try a crude approximation.

Pick some point $X_1 = (x_1, f(x_1))$ that is also on the graph. Consider the line $AX_1$. What’s its slope? Slope = Rise ÷ Run and so $AX_1$ has slope $\dfrac{f(x_1) - f(a)}{x_1 - a}$. This number serves as our first crude approximation of the slope of l.

How can we improve on this approximation? Simple: just pick some point $X_2 = (x_2, f(x_2))$ that is closer to A. The line $AX_2$ has slope $\dfrac{f(x_2) - f(a)}{x_2 - a}$. This number serves as our second, improved approximation of the slope of l.

At least in theory, we can keep repeating this procedure, by picking points that are ever closer to A. Our estimates of the slope of l will get ever better. Altogether then, we are motivated to make the following formal definition of the derivative:

Definition 33. Let f ∶ D → R be a function. Consider

$$\lim_{x \to a} \frac{f(x) - f(a)}{x - a}.$$

If this limit exists, then we say that f is differentiable at the point a ∈ D and we call this limit the value of f ’s derivative at the point a ∈ D. But if this limit does not exist, then we say that f is not differentiable at the point a ∈ D and the value of f ’s derivative at the point a ∈ D is undefined or does not exist. Here’s a simple example to illustrate.

Example 100. Graphed below is the function f ∶ R → R defined by x ↦ ∣x∣.

[Figure: graph of f. The derivative is −1 for a < 0 and 1 for a > 0; it does not exist at a = 0.]

The value of f ’s derivative at the point x = −5 is

$$\lim_{x \to -5} \frac{f(x) - f(-5)}{x - (-5)} = \lim_{x \to -5} \frac{|x| - |-5|}{x + 5} = \lim_{x \to -5} \frac{-x - 5}{x + 5} = \lim_{x \to -5} -1 = -1.$$

Similarly, the value of f ’s derivative at the point x = −3 is also

$$\lim_{x \to -3} \frac{f(x) - f(-3)}{x - (-3)} = \lim_{x \to -3} \frac{|x| - |-3|}{x + 3} = \lim_{x \to -3} \frac{-x - 3}{x + 3} = \lim_{x \to -3} -1 = -1.$$

Indeed, the value of f ’s derivative at any point a < 0 is −1, because for any a < 0,

$$\lim_{x \to a} \frac{f(x) - f(a)}{x - a} = \lim_{x \to a} \frac{|x| - |a|}{x - a} = \lim_{x \to a} \frac{-x + a}{x - a} = \lim_{x \to a} -1 = -1.$$

In contrast, for any a > 0, we have

$$\lim_{x \to a} \frac{f(x) - f(a)}{x - a} = \lim_{x \to a} \frac{|x| - |a|}{x - a} = \lim_{x \to a} \frac{x - a}{x - a} = \lim_{x \to a} 1 = 1.$$

At x = 0, f is not differentiable, as we now prove:

$$\lim_{x \to 0} \frac{f(x) - f(0)}{x - 0} = \lim_{x \to 0} \frac{|x| - |0|}{x - 0} = \lim_{x \to 0} \frac{|x|}{x} = \begin{cases} \lim_{x \to 0} \dfrac{x}{x} = \lim_{x \to 0} 1 = 1, & \text{for } x > 0, \\[2ex] \lim_{x \to 0} \dfrac{-x}{x} = \lim_{x \to 0} (-1) = -1, & \text{for } x < 0. \end{cases}$$

So as x → 0, there is no one single value towards which the expression $\dfrac{f(x) - f(0)}{x - 0}$ approaches. So the limit does not exist.
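The slopes in Example 100 can be estimated exactly as in the motivation for the derivative: pick points x ever closer to a and compute the quotient (f(x) − f(a))/(x − a). The Python sketch below uses a made-up helper name and an arbitrary list of step sizes, and does this from the left and from the right of a = −5, a = 2, and a = 0.

```python
def f(x):
    return abs(x)

def difference_quotient(func, a, x):
    """Slope of the line through (a, func(a)) and (x, func(x))."""
    return (func(x) - func(a)) / (x - a)

# Step sizes shrinking towards 0 (an arbitrary choice).
steps = [0.1, 0.01, 0.001]

for a in [-5, 2, 0]:
    from_left = [difference_quotient(f, a, a - s) for s in steps]
    from_right = [difference_quotient(f, a, a + s) for s in steps]
    print(a, from_left, from_right)
```

At a = −5 and a = 2, the slopes from both sides agree. At a = 0, the slopes from the left are all −1 and those from the right are all 1, so they do not approach a single value.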

9.2 Lagrange’s, Leibniz’s, and Newton’s Notation

The Value of the Derivative of f at a

Lagrange’s notation: $f'(a) = \lim_{x \to a} \dfrac{f(x) - f(a)}{x - a}$.

Leibniz’s notation: $\left. \dfrac{df(x)}{dx} \right|_{x=a} = \lim_{x \to a} \dfrac{f(x) - f(a)}{x - a}$ or $\left. \dfrac{df}{dx} \right|_{x=a} = \lim_{x \to a} \dfrac{f(x) - f(a)}{x - a}$.

Newton’s notation: $\dot{f}(a) = \lim_{x \to a} \dfrac{f(x) - f(a)}{x - a}$.

Some remarks.

• Lagrange’s and Leibniz’s notation are widely used. Newton’s notation is not. But Newton’s notation is sometimes used in physics (especially when the independent variable is time). You certainly need to know about Newton’s notation because it is on the A-level syllabus. Nonetheless, this textbook will avoid using Newton’s notation.

• Leibniz’s notation is convenient in that it allows us to interpret $\frac{d}{dx}$ as the “differentiate with respect to x” operator. Section 9.5 will give some examples of how this operator works.

• Here is the motivation behind Leibniz’s notation. Define ∆x to be equal to x − a, and ∆f (x) to be equal to f (x) − f (a), so that we can write:

$$\frac{\Delta f(x)}{\Delta x} = \frac{f(x) - f(a)}{x - a}.$$

The limit of this expression as x → a is precisely the value of the derivative of f at a:

$$\lim_{x \to a} \frac{\Delta f(x)}{\Delta x} = \left. \frac{df}{dx} \right|_{x=a} = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}.$$

Example 100 (revisited). Consider again the function f ∶ R → R defined by x ↦ ∣x∣. Here are the values of its derivative at a = −5, a = 2, and a = 0, in each notation:

Lagrange’s notation: $f'(-5) = -1$, $f'(2) = 1$, $f'(0)$ is undefined.

Leibniz’s notation: $\left. \dfrac{df}{dx} \right|_{x=-5} = -1$, $\left. \dfrac{df}{dx} \right|_{x=2} = 1$, $\left. \dfrac{df}{dx} \right|_{x=0}$ is undefined.

Newton’s notation: $\dot{f}(-5) = -1$, $\dot{f}(2) = 1$, $\dot{f}(0)$ is undefined.

Historical Note (optional). Here is a very oversimplified history of Leibniz’s notation, to give you a better sense of why we use it.

Leibniz (1646-1716) thought of dx as an “infinitesimal change in x”. And dy was the corresponding “infinitesimal change in y”. Leibniz then defined the derivative to be literally the quotient dy/dx.

Unfortunately, the idea of “infinitesimals” was rather vague, imprecise, and non-rigorous. So in the 19th century, mathematicians embarked on a project to put calculus on a firmer footing. In particular, they wished to rid mathematics of all references to “infinitesimals”. Eventually, they settled on the modern notion of limits, in which no reference to “infinitesimals” was necessary. This modern notion of limits is also what you’ve just learnt.

So simply put, Leibniz was wrong to think of the derivative as a fraction. And you should be very careful not to think of the derivative as a fraction, even though it looks very much like one. You are now being taught things in the correct order. First you are taught about limits. Next we define the derivative in terms of limits. We are careful to note that the derivative is not a fraction.

But if Leibniz was wrong to think of the derivative as a fraction, then why are we still using his notation? The main reason is that it is highly intuitive. In particular, it reminds us of what calculus is really about: how a small change in one variable affects another variable. It also allows us to quickly grasp the intuition behind such results as the Chain Rule, which may informally be stated as:

dz/dx = (dz/dy) ⋅ (dy/dx).

It is tempting to naïvely interpret the expressions in the above equation as fractions, naïvely apply simple algebra, naïvely cancel out the dy’s, so that the equation is indeed true. But the correct informal interpretation (easily seen when written in Leibniz’s notation) is this: “The change in z caused by a small unit change in x” is equal to “The change in z caused by a small unit change in y” × “The change in y caused by a small unit change in x”.

Another result is the Inverse Function Theorem, which may informally be stated as:

dy/dx = 1 / (dx/dy).

Again, the naïve interpretation would be of dy/dx and dx/dy as fractions, so that indeed by naïve algebra, the above equation is true. But again, the correct informal interpretation (easily seen when written in Leibniz’s notation) is this: “The change in y caused by a small unit change in x” is equal to “The reciprocal of the change in x caused by a small unit change in y”.

For a more detailed discussion, see the leading answer to this question on Math StackExchange.


9.3

The Derivative is a Function

Above we defined the value of the derivative at a given point to be a number. In contrast, we now define the derivative to be a function:

Definition 34. Let f ∶ D → R be a function and A be the set of points at which f is differentiable. Then the derivative of f is the function with domain A, the same codomain as f (namely R), and mapping rule x ↦ f′(x).

                        The Derivative of f
Lagrange’s notation:    f′.
Leibniz’s notation:     df(x)/dx or df/dx.
Newton’s notation:      ḟ.

For the next example, I assume you already know that d/dx (cx²) = 2cx and d/dx (cx) = c.

Example 101. Let f ∶ R → R be defined by f(x) = 7x². Its derivative is the function f′ ∶ R → R defined by f′(x) = 14x. This derivative may be denoted f′ or df(x)/dx or df/dx or ḟ.

The value of the derivative of f at 0.5 is f′(0.5) = df(x)/dx∣x=0.5 = df/dx∣x=0.5 = ḟ(0.5) = 7.

The value of the derivative of f at 1 is f′(1) = df(x)/dx∣x=1 = df/dx∣x=1 = ḟ(1) = 14.

The value of the derivative of f at 2 is f′(2) = df(x)/dx∣x=2 = df/dx∣x=2 = ḟ(2) = 28.
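The distinction between the derivative (a function) and its value at a point (a number) can be mirrored in a few lines of code. This is only an illustrative sketch: f_prime is the derivative from the example, while central_difference is a helper of my own that approximates the slope numerically, so that the two can be compared.

```python
def f(x):
    return 7 * x**2

def f_prime(x):
    """The derivative of f, itself a function: x -> 14x."""
    return 14 * x

def central_difference(func, x, h=1e-6):
    """Numerical estimate of the slope of func at x (an approximation only)."""
    return (func(x + h) - func(x - h)) / (2 * h)

for x in (0.5, 1, 2):
    # f_prime(x) is a number: the value of the derivative at the point x.
    print(x, f_prime(x), round(central_difference(f, x), 6))
```

Each line printed shows the point, the exact value 14x, and a numerical estimate that agrees with it to several decimal places.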


9.4

Second and Higher-Order Derivatives

The derivative is also known as the first derivative. The second derivative is, similarly, also a function:

Definition 35. Let f ∶ D → R be a function. The second derivative of f is simply the derivative of the derivative of f.

                        The Second Derivative of f
Lagrange’s notation:    f′′.
Leibniz’s notation:     d²f(x)/dx² or d²f/dx².
Newton’s notation:      f̈.

Under Leibniz’s notation, since d/dx is the operator, it makes sense to denote the second derivative of f by (d²/dx²) f or d²f/dx².

Example 101 (revisited). Let f ∶ R → R be defined by x ↦ 7x². Its second derivative is the function with domain and codomain both R, and mapping rule x ↦ 14. This second derivative may be denoted f′′ or d²f(x)/dx² or d²f/dx² or f̈.

The value of the second derivative of f at 0.5 is f′′(0.5) = d²f(x)/dx²∣x=0.5 = d²f/dx²∣x=0.5 = f̈(0.5) = 14.

The value of the second derivative of f at 1 is f′′(1) = d²f(x)/dx²∣x=1 = d²f/dx²∣x=1 = f̈(1) = 14.

The value of the second derivative of f at 2 is f′′(2) = d²f(x)/dx²∣x=2 = d²f/dx²∣x=2 = f̈(2) = 14.


We similarly define the third, fourth, fifth, etc. derivatives in the “obvious” fashion.

Definition 36. Let f ∶ D → R be a function. For n ≥ 3, the nth derivative of f is simply the derivative of the (n − 1)th derivative of f.

                        The 3rd Derivative of f     The 4th Derivative of f     Etc.
Lagrange’s notation:    f⁽³⁾.                       f⁽⁴⁾.
Leibniz’s notation:     d³f/dx³.                    d⁴f/dx⁴.
Newton’s notation:      \overset{3}{\dot{f}}.       \overset{4}{\dot{f}}.

Example 101 (revisited). Let f ∶ R → R be defined by x ↦ 7x². Its first derivative is the function f′ ∶ R → R defined by x ↦ 14x. Its second derivative is the function f′′ ∶ R → R defined by x ↦ 14. We have f′(2) = 28 and f′′(2) = 14.

Its third derivative is the function f⁽³⁾ ∶ R → R defined by x ↦ 0. Its fourth derivative is the function f⁽⁴⁾ ∶ R → R defined by x ↦ 0. Observe that f⁽³⁾ = f⁽⁴⁾. Indeed, the third and all higher-order derivatives are identical functions: f⁽³⁾ = f⁽⁴⁾ = f⁽⁵⁾ = . . .

We have f⁽³⁾(2) = f⁽⁴⁾(2) = f⁽⁵⁾(2) = ⋅ ⋅ ⋅ = 0. Indeed, for any x ∈ R, we have f⁽³⁾(x) = f⁽⁴⁾(x) = f⁽⁵⁾(x) = ⋅ ⋅ ⋅ = 0.

Exercise 57. Given f ∶ D → R and f′ ∶ A → R, what is f′′? (Answer on p. 950.)

Exercise 58. (Tedious but easy.) Let g ∶ R → R be defined by x ↦ x⁴ − x³ + x² − x + 1. Write down all of its derivatives. Evaluate all of these derivatives at 1. Write your answers in Lagrange’s, Leibniz’s, and Newton’s notation. (Answer on p. 950.)


9.5

More About Leibniz’s Notation: The d/dx Operator

Example 102. “d/dx x² = 2x” is simply shorthand for this statement:

The derivative of the function with mapping rule x ↦ x² is the function with mapping rule x ↦ 2x.

Example 103. “d/dx f = g” is simply shorthand for this statement:

The derivative of the function f is the function g.

Example 104. “d/dx (f ⋅ g) = g ⋅ f′ + f ⋅ g′” is simply shorthand for this statement:

The derivative of the function f ⋅ g is the function with mapping rule x ↦ g(x) ⋅ f′(x) + f(x) ⋅ g′(x).


9.6

Standard Rules of Differentiation

Proposition 1. Let f ∶ A → R and g ∶ B → R be differentiable functions with derivatives f′ and g′. Suppose also that the composite function f ○ g ∶ B → R is well-defined. Let k ∈ R be a constant. Then:

d/dx k = 0,
d/dx (f ± g) = f′ ± g′,
d/dx (kf) = kf′,
d/dx xᵏ = kxᵏ⁻¹,
d/dx eˣ = eˣ,
d/dx ln x = 1/x,
d/dx sin x = cos x,
d/dx cos x = −sin x,
d/dx (f ⋅ g) = g ⋅ f′ + f ⋅ g′,
d/dx (f/g) = (g ⋅ f′ − f ⋅ g′) / (g ⋅ g),
d/dx (f ○ g) = d(f ○ g)/dg ⋅ dg/dx.

(My mnemonic for the Quotient Rule is: “Lo-D-Hi minus Hi-D-Lo; cross over and square the low.”)

Proof. Optional, see p. 883 in the Appendices.

Of the above rules, the Chain Rule is the most powerful. We can also write it more elegantly (if a little imprecisely) as dz/dx = dz/dy ⋅ dy/dx.

As discussed above in the historical note (p. 116), thus written, the Chain Rule has a beautiful informal interpretation: “The change in z caused by a small unit change in x” is equal to “The change in z caused by a small unit change in y” × “The change in y caused by a small unit change in x”. This makes perfect sense:


Example 105. When I add 1 g of Milo (the x-variable) to a cup of water, the volume of the water increases by 2 cm³ (the y-variable). That is, dy/dx = 2 cm³ g⁻¹.

When the volume of the water increases by 1 cm³ (the y-variable), the water level (in the cup) rises by 0.3 cm (the z-variable). That is, dz/dy = 0.3 cm cm⁻³ = 0.3 cm⁻².

Altogether then, when I add 1 g of Milo (the x-variable) to a cup of water, I’d expect the water level to rise by 0.6 cm. That is, dz/dx = 0.6 cm g⁻¹. This is indeed consistent with

dz/dx = dz/dy ⋅ dy/dx = 0.3 × 2 = 0.6 cm g⁻¹.

In case you’ve forgotten how it works, here are a few examples to illustrate:

Example 106. Let h ∶ R → R be defined by x ↦ e^(sin x).

h′(x) = d e^(sin x)/dx = d e^(sin x)/d(sin x) ⋅ d(sin x)/dx = e^(sin x) cos x.

Example 107. Let g ∶ R → R be defined by x ↦ √(4x − 1).

g′(x) = d√(4x − 1)/dx = d√(4x − 1)/d(4x − 1) ⋅ d(4x − 1)/dx = 0.5 (4x − 1)^(−0.5) ⋅ 4 = 2 (4x − 1)^(−0.5).

Here’s a more complicated example, where the Chain Rule is applied twice.

Example 108. Let f ∶ R → R be defined by x ↦ [sin(2x − 3) + cos(5 − 2x)]³. Then

f′(x) = d[sin(2x − 3) + cos(5 − 2x)]³ / dx

= d[sin(2x − 3) + cos(5 − 2x)]³ / d[sin(2x − 3) + cos(5 − 2x)] ⋅ d[sin(2x − 3) + cos(5 − 2x)] / dx

= 3 [sin(2x − 3) + cos(5 − 2x)]² [d sin(2x − 3)/d(2x − 3) ⋅ d(2x − 3)/dx + d cos(5 − 2x)/d(5 − 2x) ⋅ d(5 − 2x)/dx]

= 3 [sin(2x − 3) + cos(5 − 2x)]² [cos(2x − 3) ⋅ 2 − sin(5 − 2x) ⋅ (−2)]

= 6 [sin(2x − 3) + cos(5 − 2x)]² [cos(2x − 3) + sin(5 − 2x)].
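A long Chain Rule computation like the one in Example 108 is easy to get wrong by a sign or a factor of 2, so it can be worth checking the final formula numerically. The sketch below does this; the helper central_difference, the step size, and the sample points are my own choices, and agreement at a few points is a sanity check rather than a proof.

```python
import math

def f(x):
    return (math.sin(2 * x - 3) + math.cos(5 - 2 * x)) ** 3

def f_prime(x):
    """The closed-form derivative obtained in Example 108."""
    inner = math.sin(2 * x - 3) + math.cos(5 - 2 * x)
    return 6 * inner**2 * (math.cos(2 * x - 3) + math.sin(5 - 2 * x))

def central_difference(func, x, h=1e-5):
    """Numerical estimate of the slope of func at x (an approximation only)."""
    return (func(x + h) - func(x - h)) / (2 * h)

for x in (0.0, 0.7, 2.0):
    print(x, f_prime(x), central_difference(f, x))
```

At each sample point the two printed numbers should agree to many decimal places.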


Exercise 59. For each of the following functions (assume they have a suitably defined domain and codomain), evaluate the first derivative at 0. (Answer on p. 951.)

(a) f(x) = x².
(b) g(x) = x / (1 + [x − ln(x + 1)]²).
(c) h(x) = sin( … / (1 + [x − ln(x + 1)]²) ).

Corollary 1. d/dx tan x = sec² x, d/dx cot x = −csc² x, and d/dx csc x = −csc x cot x.

Proof. Using the quotient rule,

d/dx tan x = d/dx (sin x / cos x) = [cos x cos x − sin x (−sin x)] / cos² x = (cos² x + sin² x) / cos² x = 1 / cos² x = sec² x.

For the derivatives of cot x and csc x, see Exercise 60.

SYLLABUS ALERT: d/dx csc x = −csc x cot x is in the List of Formulae for 9758 (revised), but not for 9740 (old).

Exercise 60. Prove the following: d/dx cot x = −csc² x and d/dx csc x = −csc x cot x. (Answer on p. 951.)

Exercise 61. (Answer on p. 951.) (a) Newton’s Second Law of Motion is that force is equal to the rate of change of momentum, where momentum is the product of mass and velocity. Write down this law in mathematical notation, with F , m, v, and t denoting force, mass, velocity, and time. (b) Assume that mass is constant. Explain why Newton’s Second Law then simplifies into the more-familiar F = ma, where a is acceleration (i.e. the rate of change of velocity).


9.7

Differentiable and Twice-Differentiable Functions

Definition 37. A function is differentiable on a set of points if it is differentiable at every point in that set.

Definition 38. A function is differentiable if it is differentiable on its domain.

Definition 39. A function is twice-differentiable on a set of points if it is twice-differentiable at every point in that set.

Definition 40. A function is twice-differentiable if it is twice-differentiable on its domain.

In other words, f is differentiable if and only if f ′ has the same domain as f . Similarly, f is twice-differentiable if and only if f ′′ has the same domain as f . And of course, if a function is twice-differentiable, then it is also differentiable.

(The definitions for a function to be thrice-differentiable, four-times-differentiable, etc. are very much analogous, but this textbook will have no reason to use these terms.) The condition that the first derivative (or second derivative) exists at every point in the domain is important. Failing which, we do not consider the function to be differentiable (or twice-differentiable). The three functions in the next example illustrate:


Example 109. Consider f ∶ R → R defined by f(x) = x². We have f′(x) = 2x and f′′(x) = 2 for all x ∈ R. And so f is both differentiable and twice-differentiable.

Now consider g ∶ R → R defined by g(x) = x∣x∣ (graphed below). We have g′(x) = 2∣x∣ for all x ∈ R and

g′′(x) = −2, for x < 0,
g′′(x) = 2, for x > 0.

But g′′(0) does not exist. And so g is differentiable but NOT twice-differentiable.

[Figure: graph of g, annotated with g′(x) = 2∣x∣ for all x; g′′(x) = −2 for x < 0; g′′(x) = 2 for x > 0; g′′(0) is undefined.]

Consider h ∶ R → R defined by x ↦ ∣x∣. We have

h′(x) = −1, for x < 0,
h′(x) = 1, for x > 0.

But h′(0) does not exist. So h is not even once-differentiable. (And thus it is certainly not twice-differentiable either.)
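For the function g(x) = x∣x∣ of this example, the claims at the point 0 can be worked out directly. The derivation below is a sketch that assumes the limit definition of the derivative from earlier in this chapter; the letter t is simply my name for the small increment (chosen to avoid a clash with the function h).

```latex
\begin{aligned}
g'(0) &= \lim_{t \to 0} \frac{g(0+t) - g(0)}{t}
       = \lim_{t \to 0} \frac{t\,|t| - 0}{t}
       = \lim_{t \to 0} |t| = 0, \\[4pt]
\frac{g'(0+t) - g'(0)}{t} &= \frac{2|t| - 0}{t}
       = \begin{cases} 2, & t > 0, \\ -2, & t < 0. \end{cases}
\end{aligned}
```

The first line confirms that g′(0) = 0 = 2∣0∣. The second line shows that the difference quotient of g′ at 0 tends to 2 from the right but to −2 from the left, so it has no limit as t → 0, and g′′(0) does not exist.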


We can of course also consider thrice-differentiable, four-times-differentiable, etc. functions. We can even consider infinitely-differentiable functions. Indeed, in the A-levels, most functions are usually infinitely-differentiable. For example, all polynomials are infinitely-differentiable, as illustrated in the next example.

Example 110. Consider i ∶ R → R defined by x ↦ x⁵ − x⁴ + x³ − x² + x − 1. We have, for all x ∈ R,

i′(x) = 5x⁴ − 4x³ + 3x² − 2x + 1,
i′′(x) = 20x³ − 12x² + 6x − 2,
i⁽³⁾(x) = 60x² − 24x + 6,
i⁽⁴⁾(x) = 120x − 24,
i⁽⁵⁾(x) = 120,
i⁽⁶⁾(x) = i⁽⁷⁾(x) = i⁽⁸⁾(x) = ⋅ ⋅ ⋅ = 0.

The function i is infinitely-differentiable, with the 6th and higher-order derivatives all having the mapping rule x ↦ 0.

Simple exponential functions are also infinitely-differentiable:

Example 111. Consider j ∶ R → R defined by x ↦ eˣ. We have, for all x ∈ R,

j′(x) = j′′(x) = j⁽³⁾(x) = j⁽⁴⁾(x) = ⋅ ⋅ ⋅ = eˣ.

The function j is infinitely-differentiable, with every derivative simply being the same function as j.
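Repeatedly differentiating a polynomial, as in Example 110, is mechanical enough to hand to a few lines of code. In this sketch a polynomial is stored as a list of coefficients in increasing powers of x; that representation and the function name differentiate are my own conventions, not the textbook’s.

```python
def differentiate(coeffs):
    """Differentiate c0 + c1*x + c2*x**2 + ... given as the list [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:]

# Example 110: i(x) = x^5 - x^4 + x^3 - x^2 + x - 1, lowest power first.
i_coeffs = [-1, 1, -1, 1, -1, 1]

derivatives = []
current = i_coeffs
while current:                      # an empty list stands for the zero polynomial
    current = differentiate(current)
    derivatives.append(current)

for order, coeffs in enumerate(derivatives, start=1):
    print(order, coeffs)
```

The fifth list printed is [120] and the sixth is empty, matching the observation that the 6th and all higher-order derivatives have mapping rule x ↦ 0.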


9.8

Differentiability Implies (i.e. is Stronger Than) Continuity

Informally, continuity is a “smoothness” condition: if a function is continuous, then its graph has no “holes” or “jumps” anywhere and can be drawn smoothly without lifting your pencil. Differentiability is a stronger “smoothness” condition. If a function is differentiable, then its graph is continuous (i.e. has no “holes” or “jumps”) and moreover has no “kinks” or other “abrupt turns”.

Example 112. Graphed below are the functions f, g, and h. f is both continuous and differentiable. g is continuous (you can draw its entire graph without lifting your pencil). However, it is not differentiable because of the “kink”. h is neither continuous nor differentiable, because of the “hole”.

[Figure: graphs of f (both continuous and differentiable), g (continuous, but not differentiable), and h (neither continuous nor differentiable).]

Theorem 1. If f ∶ D → R is differentiable at a ∈ D, then f is continuous at a.

Proof. Optional, see p. 887 in the Appendices.

9.9

Implicit Differentiation

Example 113. Consider the equation x² + y² = 1. What is dy/dx?

Method #1. First write y in terms of x: y = ±√(1 − x²). Then differentiate:

dy/dx = ± (−2x) / (2√(1 − x²)) = ± (−x) / √(1 − x²) = ∓x / √(1 − x²).

Method #2 (implicit differentiation). Directly apply d/dx to the given equation:

d/dx (x² + y²) = d/dx (1) ⇐⇒ 2x + 2y dy/dx = 0 ⟹ dy/dx = −x/y.

If desired, we can plug in y = ±√(1 − x²) to get the same answer as before:

dy/dx = −x / (±√(1 − x²)) = ∓x / √(1 − x²).

In the above example, the second method (implicit differentiation) is not obviously superior to the first. However, it is sometimes difficult (or impossible) to express y in terms of x. Nonetheless we might still want to compute dy/dx. In such cases, the method of implicit differentiation is wonderful. The next example illustrates:

Example 114. Consider the equation x²√y + y/cos x = 1. What is dy/dx∣x=0 (i.e. what is dy/dx when evaluated at x = 0)?

In this example, it’s difficult to express y in terms of x. But this doesn’t matter, because we can use implicit differentiation:

d/dx (x²√y + y/cos x) = d/dx (1) ⇐⇒ 2x√y + x² ⋅ 1/(2√y) ⋅ dy/dx + [cos x ⋅ dy/dx − y(−sin x)] / cos² x = 0.

Now plug in x = 0:

2 ⋅ 0 ⋅ √y + 0² ⋅ 1/(2√y) ⋅ dy/dx + [cos 0 ⋅ dy/dx − y(−sin 0)] / cos² 0 = 0 ⇐⇒ dy/dx = 0.
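The answer to Example 114 can be cross-checked numerically without ever solving for y by hand. The sketch below is my own construction, not the textbook’s method: solve_y finds y for a given x by bisection (assuming the solution lies between 0.5 and 1.5, which holds for x near 0), numeric_slope estimates dy/dx from two nearby solutions, and implicit_slope is dy/dx as obtained by applying the Quotient Rule to y/cos x and then solving the differentiated equation for dy/dx.

```python
import math

def F(x, y):
    """Left-hand side minus right-hand side of x^2*sqrt(y) + y/cos(x) = 1."""
    return x**2 * math.sqrt(y) + y / math.cos(x) - 1

def solve_y(x, lo=0.5, hi=1.5, steps=100):
    """Bisection for the y in [lo, hi] with F(x, y) = 0 (F increases with y)."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if F(x, mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def numeric_slope(x, h=1e-4):
    """Estimate dy/dx from the solutions at x - h and x + h."""
    return (solve_y(x + h) - solve_y(x - h)) / (2 * h)

def implicit_slope(x, y):
    """dy/dx from implicit differentiation, solved for dy/dx."""
    numerator = 2 * x * math.sqrt(y) + y * math.sin(x) / math.cos(x) ** 2
    denominator = x**2 / (2 * math.sqrt(y)) + 1 / math.cos(x)
    return -numerator / denominator

print(solve_y(0), numeric_slope(0), implicit_slope(0, solve_y(0)))
print(numeric_slope(0.3), implicit_slope(0.3, solve_y(0.3)))
```

At x = 0 the solution is y = 1 and both slopes are 0, as in the example. The second line compares the two slopes at x = 0.3, where the sign of the Quotient Rule term actually matters.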


The four rules of differentiation in the next corollary are in the List of Formulae you get during A-level exams (both 9740 and 9758), so you need not know these by heart.

Corollary 2. d/dx sec x = sec x tan x, d/dx sin⁻¹ x = 1/√(1 − x²), d/dx cos⁻¹ x = −1/√(1 − x²), and d/dx tan⁻¹ x = 1/(1 + x²).

Proof. To prove that d/dx sin⁻¹ x = 1/√(1 − x²), first rewrite y = sin⁻¹ x as x = sin y. Next apply d/dx (implicit differentiation) to get 1 = cos y ⋅ dy/dx. But sin² y + cos² y = 1, so cos y = √(1 − x²). (We take the non-negative root because y = sin⁻¹ x lies in [−π/2, π/2], where cos y ≥ 0.) And so,

dy/dx = d/dx sin⁻¹ x = 1/cos y = 1/√(1 − x²).

Exercise 62 asks you to prove that the derivatives of sec x, cos⁻¹ x, and tan⁻¹ x are as claimed.

Exercise 62. Prove that d/dx sec x = sec x tan x, d/dx cos⁻¹ x = −1/√(1 − x²), and d/dx tan⁻¹ x = 1/(1 + x²). (Answer on p. 951.)


10

Increasing, Decreasing, and f′

10.1

When a Function is Increasing or Decreasing

Example 115. Consider the function f ∶ R → R defined by x ↦ x². It is decreasing on R⁻₀, increasing on R⁺₀, strictly decreasing on R⁻, and strictly increasing on R⁺.

[Figure: graph of f. The left half is labelled as decreasing and strictly decreasing, the right half as increasing and strictly increasing, and the point x = 0 as both increasing and decreasing.]

Note: At x = 0, f is both decreasing and increasing, but neither strictly decreasing nor strictly increasing. This follows from the formal definitions (below).

Definition 41. Given a function f and a set of points S, we say that f is ...

1. ... increasing on S if for any x1, x2 ∈ S with x2 > x1, we have f(x2) ≥ f(x1);
2. ... strictly increasing on S if for any x1, x2 ∈ S with x2 > x1, we have f(x2) > f(x1);
3. ... decreasing on S if for any x1, x2 ∈ S with x2 > x1, we have f(x2) ≤ f(x1);
4. ... strictly decreasing on S if for any x1, x2 ∈ S with x2 > x1, we have f(x2) < f(x1).

Of course, if a function is strictly increasing on a set of points, then it is also increasing on that set. And if it is strictly decreasing, then it is also decreasing.

Exercise 63. Let g ∶ R → R be defined by x ↦ sin x. Identify the sets on which g is increasing, decreasing, strictly increasing and/or strictly decreasing. (Answer on p. 952.)
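Definition 41 compares every pair of points in S, and that pairwise check translates directly into code. The sketch below applies it to a finite sample of points only, so a True result is evidence rather than proof, whereas a False result exhibits a genuine counterexample pair. The function names and sample grids are my own.

```python
def is_increasing(f, points, strict=False):
    """Check the condition of Definition 41 on every pair x1 < x2 of sample points."""
    pts = sorted(set(points))
    for i, x1 in enumerate(pts):
        for x2 in pts[i + 1:]:
            if strict and not f(x2) > f(x1):
                return False
            if not strict and not f(x2) >= f(x1):
                return False
    return True

def is_decreasing(f, points, strict=False):
    """f is (strictly) decreasing exactly when -f is (strictly) increasing."""
    return is_increasing(lambda x: -f(x), points, strict)

def square(x):
    return x**2

non_negative = [k / 10 for k in range(0, 31)]    # sample of [0, 3]
non_positive = [-k / 10 for k in range(0, 31)]   # sample of [-3, 0]

print(is_increasing(square, non_negative))       # increasing on the sample of [0, 3]
print(is_decreasing(square, non_positive))       # decreasing on the sample of [-3, 0]
print(is_increasing(square, [-1, 0, 1]))         # fails: square(0) < square(-1)
print(is_increasing(lambda x: 3, [0, 1, 2], strict=True))  # constant: not strictly increasing
```

The last line illustrates the difference between “increasing” and “strictly increasing”: a constant function satisfies the first condition but not the second.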


10.2

The First Derivative Increasing/Decreasing Test

The derivative is the slope of the tangent. And so not surprisingly, the derivative is intimately related to whether a function is increasing or decreasing. Formally:

Fact 10. Let f ∶ R → R be a differentiable function. Let a, b ∈ R with b > a. Then

1. f is decreasing on (a, b) ⇐⇒ f′(x) ≤ 0, for all x ∈ (a, b).
2. f is increasing on (a, b) ⇐⇒ f′(x) ≥ 0, for all x ∈ (a, b).
3. f is strictly decreasing on (a, b) ⟸ f′(x) < 0, for all x ∈ (a, b).
4. f is strictly increasing on (a, b) ⟸ f′(x) > 0, for all x ∈ (a, b).
5. f is both increasing and decreasing at a ⇐⇒ f′(a) = 0.

(The converses of 3 and 4 fail: the function x ↦ x³ is strictly increasing on R, yet its derivative is 0 at x = 0.)

Proof. Optional, see p. 888 in the Appendices.

Example 115 (revisited). Consider again f ∶ R → R defined by x ↦ x².

1. f is decreasing on R⁻₀, and so f′(x) ≤ 0 for x ≤ 0.
2. f is increasing on R⁺₀, and so f′(x) ≥ 0 for x ≥ 0.
3. f′(x) < 0 for x < 0, and so f is strictly decreasing on R⁻.
4. f′(x) > 0 for x > 0, and so f is strictly increasing on R⁺.
5. f is both increasing and decreasing at x = 0, and so f′(0) = 0.

[Figure: graph of f, annotated with the sign of f′(x) on each side of 0, and with f′(0) = 0 at the point x = 0 where f is both increasing and decreasing.]

11

Extreme, Stationary, and Turning Points

11.1

Maximum and Minimum Points

Let f ∶ D → R and x ∈ D. Informally:²³

1. If f(x) ≥ f(a) for all a ∈ D that are “close to” x, then we call x a maximum point of f and f(x) a maximum value.
2. If f(x) ≤ f(a) for all a ∈ D that are “close to” x, then we call x a minimum point of f and f(x) a minimum value.
3. If f(x) > f(a) for all a ∈ D (other than x itself) that are “close to” x, then we call x a strict maximum point of f and f(x) a strict maximum value.
4. If f(x) < f(a) for all a ∈ D (other than x itself) that are “close to” x, then we call x a strict minimum point of f and f(x) a strict minimum value.

Of course, a strict maximum point is also a maximum point. And a strict minimum point is also a minimum point. Any maximum or minimum point is also known as an extremum (plural: extrema) or an extreme point.

Example 116. Graphed below is f ∶ R → R defined by f(x) = −(x − 1)². x = 1 is a maximum point and a strict maximum point of f. The corresponding (strict) maximum value is f(1) = 0.

Also graphed is g ∶ R → R defined by g(x) = (x + 1)². x = −1 is a minimum point and a strict minimum point of g. The corresponding (strict) minimum value is g(−1) = 0.

²³ See p. 888 in the Appendices for the formal definitions.

A function can have multiple maximum and multiple minimum points:

Example 117. Graphed below is h ∶ R → R defined by x ↦ 6x⁵ − 15x⁴ − 10x³ + 30x².

• x = −1 is a maximum point and a strict maximum point of h. The corresponding maximum value (and also strict maximum value) is h(−1) = 19.
• x = 1 is a maximum point and a strict maximum point of h. The corresponding maximum value (and also strict maximum value) is h(1) = 11.
• x = 0 is a minimum point and a strict minimum point of h. The corresponding minimum value (and also strict minimum value) is h(0) = 0.
• x = 2 is a minimum point and a strict minimum point of h. The corresponding minimum value (and also strict minimum value) is h(2) = −8.

[Figure: graph of h for x between −2 and 3, with the maximum points x = ±1 and the minimum points x = 0, 2 marked.]
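The four points of Example 117 and their values can be verified with a short script. The sketch below assumes the derivative h′(x) = 30x⁴ − 60x³ − 30x² + 60x, which I obtained by differentiating h term by term, and classifies each point by the sign of h′ just to its left and just to its right (rising then falling means a maximum; falling then rising means a minimum). The helper classify and the offset eps are my own choices.

```python
def h(x):
    return 6 * x**5 - 15 * x**4 - 10 * x**3 + 30 * x**2

def h_prime(x):
    """Term-by-term derivative of h."""
    return 30 * x**4 - 60 * x**3 - 30 * x**2 + 60 * x

def classify(x, eps=1e-3):
    """Classify a stationary point by the sign of h' just left and right of it."""
    left, right = h_prime(x - eps), h_prime(x + eps)
    if left > 0 > right:
        return "strict maximum"
    if left < 0 < right:
        return "strict minimum"
    return "neither"

for x in (-1, 0, 1, 2):
    print(x, h(x), h_prime(x), classify(x))
```

Each of the four points has h′(x) = 0, and the printed values of h are 19, 0, 11, and −8, in agreement with the example.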


The next example highlights the fact that a maximum point is sometimes not a strict maximum point. Likewise with minimum points.

Example 118. Below is graphed i ∶ R → R defined by x ↦ 3. (This is a constant function.)

• Every point x ∈ R is a maximum point of i. The corresponding maximum value is always i(x) = 3.
• But no point is a strict maximum point.
• Every point x ∈ R is a minimum point of i. The corresponding minimum value is always i(x) = 3.
• But no point is a strict minimum point.

[Figure: graph of the constant function i for x between −2 and 3. Every point is a maximum point. Every point is a minimum point.]

11.2

Global Maximum and Minimum Points

Definition 42. Let f ∶ D → R and a ∈ D.

1. If f(a) ≥ f(x) for all x ∈ D, we call a a global maximum point of f and f(a) the global maximum value.
2. If f(a) ≤ f(x) for all x ∈ D, we call a a global minimum point of f and f(a) the global minimum value.
3. If f(a) > f(x) for all x ∈ D∖{a}, we call a the strict global maximum point of f and f(a) the strict global maximum value.
4. If f(a) < f(x) for all x ∈ D∖{a}, we call a the strict global minimum point of f and f(a) the strict global minimum value.

The next fact is perhaps obvious:

Fact 11. There cannot be more than one strict global maximum point of a function. (Similarly, there cannot be more than one strict global minimum point of a function.)

Proof. Suppose for contradiction that two distinct points x1 and x2 are strict global maximum points of f. Then since x1 is a strict global maximum point, we have f(x1) > f(x2). Similarly, since x2 is a strict global maximum point, we have f(x2) > f(x1). The two inequalities are contradictory. So it is impossible that two distinct points x1 and x2 are strict global maximum points of f.


Example 117 (revisited). Consider again the function h ∶ R → R defined by x ↦ 6x⁵ − 15x⁴ − 10x³ + 30x². (Graph reproduced below for convenience.)

x = ±1 are maximum points. However, they are not global maximum points. Indeed, h has no global maximum point because lim h(x) = ∞ as x → ∞ (“as x increases without bound, h(x) also increases without bound”). In other words, there is no x such that h(x) ≥ h(a) for all a ∈ R.

Similarly, x = 0, 2 are minimum points. However, they are not global minimum points. Indeed, h has no global minimum point because lim h(x) = −∞ as x → −∞ (“as x decreases without bound, h(x) also decreases without bound”). In other words, there is no x such that h(x) ≤ h(a) for all a ∈ R.

[Figure: graph of h for x between −2 and 3, with the maximum points x = ±1 and the minimum points x = 0, 2 marked.]

We next restrict the domain of h in two ways to create two new functions i and j:

Example 117 (revisited). Graphed below (left) is the function i ∶ [−1.5, 2.5] → R defined by x ↦ 6x⁵ − 15x⁴ − 10x³ + 30x².

i has three maximum points in total, namely ±1, 2.5. However, only 2.5 is a global maximum point of i because only i(2.5) ≥ i(x) for all x ∈ [−1.5, 2.5]. Of course, it is also a strict global maximum point because i(2.5) > i(x) for all other x ∈ [−1.5, 2.5].

i has three minimum points in total, namely −1.5, 0, 2. However, only −1.5 is a global minimum point of i because only i(−1.5) ≤ i(x) for all x ∈ [−1.5, 2.5]. Of course, it is also a strict global minimum point because i(−1.5) < i(x) for all other x ∈ [−1.5, 2.5].

[Figure: two panels. Left: graph of i, with x = ±1 marked as maximum points, x = 2.5 as a maximum and global maximum point, x = 0, 2 as minimum points, and x = −1.5 as a minimum and global minimum point. Right: graph of j, with x = −1 marked as a maximum and global maximum point, x = 1, 2.2 as maximum points, x = −1.2, 0 as minimum points, and x = 2 as a minimum and global minimum point.]

Also graphed above (right) is the function j ∶ [−1.2, 2.2] → R defined by x ↦ 6x⁵ − 15x⁴ − 10x³ + 30x².

Again, there are three maximum points in total, namely ±1, 2.2. However, only −1 is a global maximum point of j because only j(−1) ≥ j(x) for all x ∈ [−1.2, 2.2]. Of course, it is also a strict global maximum point because j(−1) > j(x) for all other x ∈ [−1.2, 2.2].

And again, there are three minimum points in total, namely −1.2, 0, 2. However, only 2 is a global minimum point of j because only j(2) ≤ j(x) for all x ∈ [−1.2, 2.2]. Of course, it is also a strict global minimum point because j(2) < j(x) for all other x ∈ [−1.2, 2.2].
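The global maximum and minimum points of i and j can be confirmed by evaluating the function at a handful of candidate points. The sketch below rests on two assumptions of mine that go slightly beyond this section: that the points where the derivative is zero are −1, 0, 1, 2 (they are the maximum and minimum points found in Example 117), and that on a closed interval a differentiable function can attain its largest and smallest values only at such points or at the two endpoints.

```python
def h(x):
    return 6 * x**5 - 15 * x**4 - 10 * x**3 + 30 * x**2

def global_extrema(f, candidates):
    """Return (point with the largest value, point with the smallest value)."""
    values = {x: f(x) for x in candidates}
    x_max = max(values, key=values.get)
    x_min = min(values, key=values.get)
    return x_max, x_min

stationary = [-1, 0, 1, 2]                    # assumed: points where h' = 0
i_candidates = [-1.5] + stationary + [2.5]    # domain of i is [-1.5, 2.5]
j_candidates = [-1.2] + stationary + [2.2]    # domain of j is [-1.2, 2.2]

print(global_extrema(h, i_candidates))
print(global_extrema(h, j_candidates))
```

For i the largest value is at the endpoint 2.5 and the smallest at the endpoint −1.5; for j the largest is at −1 and the smallest at 2. Shrinking the domain is what moves the global extrema from the endpoints to interior points.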


Note that the A-level syllabuses and exams only ever talk about maximum and minimum points. They do not ever talk about 1. Strict maximum points; 2. Strict minimum points; 3. Global maximum points; 4. Global minimum points; 5. Strict global minimum points; and 6. Strict global maximum points. Nonetheless, these concepts are not difficult to grasp. It is thus well worth learning them, just so you have a better understanding of how to find maximum and minimum points. Note also that what we simply call maximum and minimum points are sometimes instead called local maximum and minimum points, so that they are better contrasted with global maximum or minimum points.

Exercise 64. (Answer on p. 952.) For each of the following functions, write down, if any of these exist, the (i) maximum points, (ii) minimum points, (iii) strict maximum points, (iv) strict minimum points, (v) global maximum points, (vi) global minimum points, (vii) strict global maximum points, (viii) strict global minimum points; and also all the corresponding values of the function at these points.

(a) f ∶ R → R defined by x ↦ 100.
(b) g ∶ R → R defined by x ↦ x².
(c) h ∶ [1, 2] → R defined by x ↦ x².


11.3

Stationary and Turning Points

Definition 43. A point x is a stationary point of f if f ′ (x) = 0.

Graphically, a stationary point is where the slope of the tangent is 0 (flat).

Definition 44. A turning point is any point that is both a stationary point and a maximum or minimum point.

So every turning point is both a stationary point and an extreme point. But the converse is not true: A stationary point need not always be a turning point. And an extreme point need not always be a turning point.


Example 119. Graphed below is the function f ∶ [−1.5, 0.5] → R defined by x ↦ x⁵ + 2x⁴ + x³. Five points are labelled. The table below classifies each point.

D is a stationary point but not a turning point. (As we shall learn in Chapter 12, D is an example of an inflexion point.)

A is a minimum point and E is a maximum point. But neither is a turning point.

Type                 A    B    C    D    E
Max                       ✓              ✓
Min                  ✓         ✓
Strict Max                ✓              ✓
Strict Min           ✓         ✓
Global Max                               ✓
Global Min           ✓
Strict Global Max                        ✓
Strict Global Min    ✓
Stationary                ✓    ✓    ✓
Turning                   ✓    ✓

Exercise 65. Is each of the following statements true or false? To show that a statement is false, simply give a counterexample from the above example. If it is true, explain why. (Answer on p. 953.)

(a) Every maximum point or minimum point is a stationary point.
(b) Every maximum point or minimum point is a turning point.
(c) Every stationary point is a maximum point or minimum point.
(d) Every turning point is a maximum point or minimum point.
(e) Every turning point is a stationary point.
(f) Every stationary point is a turning point.

11.4

The Interior Extremum Theorem

Informally, a point x ∈ S is in the interior of a set S if x is not at the “edge” of S. Formally:

Definition 45. x ∈ S is in the interior of S if there exists δ > 0 such that (x − δ, x + δ) ⊆ S. x ∈ S is a non-interior point of S if it is not in the interior of S.

Example 120. Consider the set S = [0, 1]. The points 0.2, 1/3, and 0.775 are all in the interior of S. Indeed, every point x ∈ (0, 1) is in the interior of S. In contrast, the points 0 and 1 are non-interior points of S.

Example 121. Consider the set S = [0, 0.5) ∪ (0.5, 1]. The points 0.2, 1/3, and 0.775 are all in the interior of S. Indeed, every point x ∈ (0, 0.5) ∪ (0.5, 1) is in the interior of S.

In contrast, the points 0 and 1 are non-interior points of S.

The point 0.5 is not in the interior of S. It is not even a non-interior point of S, because it is not in the set S to begin with.
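The three-way classification used in Examples 120 and 121 (interior, non-interior, or not in S at all) can be sketched in code for sets that are unions of intervals. The representation of each interval as a tuple (lo, hi, lo_closed, hi_closed) and all the function names are my own; the interior test below also assumes that no point belonging to S is an endpoint shared by two of the pieces.

```python
def in_set(x, pieces):
    """Is x a member of the union of the given intervals?"""
    for lo, hi, lo_closed, hi_closed in pieces:
        above = x > lo or (lo_closed and x == lo)
        below = x < hi or (hi_closed and x == hi)
        if above and below:
            return True
    return False

def is_interior(x, pieces):
    """x is interior if it lies strictly inside one piece (room for some delta > 0)."""
    return any(lo < x < hi for lo, hi, _, _ in pieces)

def classify(x, pieces):
    if not in_set(x, pieces):
        return "not in S"
    return "interior" if is_interior(x, pieces) else "non-interior"

S_120 = [(0, 1, True, True)]                             # [0, 1]
S_121 = [(0, 0.5, True, False), (0.5, 1, False, True)]   # [0, 0.5) U (0.5, 1]

for x in (0, 0.2, 1 / 3, 0.5, 0.775, 1):
    print(x, classify(x, S_120), classify(x, S_121))
```

The only point on which the two sets differ is 0.5: it is an interior point of [0, 1], but it is not in the second set at all, so there it is neither interior nor non-interior.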


The Interior Extremum Theorem (IET) is the fundamental reason why we lurrrve taking derivatives and setting them equal to zero: this is a great way to find maxima and minima!

Theorem 2. (Interior Extremum Theorem [IET].) Let f ∶ D → R be a differentiable function. If a is a maximum or minimum point AND in the interior of D, then f′(a) = 0 (i.e. a is a stationary point).

Proof. Optional, see p. 889 in the Appendices.

Here’s a non-rigorous explanation of the intuition behind the IET:

Example 116 (revisited). Graphed below is f ∶ R → R defined by x ↦ −(x − 1)². Here’s the intuition for why f′(1) = 0:

In order for 1 to be a maximum point of f, it must be that to its left, f is increasing; while to its right, f is decreasing. In other words, to the left of 1, f′(x) ≥ 0. While to the right of 1, f′(x) ≤ 0. Altogether then, we must have f′(1) = 0. That is, at the maximum point, the slope of the function must be 0.

Exercise 66. Refer again to Example 116. Explain the intuition for why g′(−1) = 0. (Answer on p. 953.)

Exercise 67. True or false: “Let f ∶ D → R be a differentiable function. If c is a maximum or minimum point AND in the interior of D, then c is a turning point.” (Answer on p. 953.)


11.5

How to Find Maximum and Minimum Points

In secondary school, you may have been taught that to find the maximum and minimum points of f , simply follow this procedure: The Incorrect Recipe for Finding Maximum and Minimum Points. Given a differentiable function f ∶ D → R,

1. Compute f ′ (x). Find the points x at which f ′ (x) = 0.

2. These points are also the maximum and minimum points. (If we also want to know which are maximum and which are minimum points, then simply employ some method like sketch-the-graph or the Second Derivative Test.)

Unfortunately, the above procedure (let’s call it the Incorrect Recipe) may sometimes fail. It rests on the false belief that “f′(x) = 0 ⇐⇒ x is an extremum”. This is false because

1. The IET does NOT say, “f′(x) = 0 ⟹ x is an extremum.” It is perfectly possible that f′(x) = 0 without x being an extremum.

2. The IET does NOT say, “x is an extremum ⟹ f′(x) = 0.” Instead, it says, “x is an extremum AND an interior point ⟹ f′(x) = 0.” Thus, it is perfectly possible that x is an extremum without f′(x) = 0.

Here is an example to illustrate these two failings of the Incorrect Recipe.

Example 141 (revisited). Graphed below is the function f ∶ [−1.5, 0.5] → R defined by x ↦ x⁵ + 2x⁴ + x³. Five points are labelled. According to the Incorrect Recipe,

1. Compute f′(x) = 5x⁴ + 8x³ + 3x² = x²(5x² + 8x + 3) = x²(5x + 3)(x + 1). We see that f′(x) = 0 ⇐⇒ x = −3/5, −1, 0.

2. So −3/5, −1, 0 are the maximum and minimum points of f.

[Figure: graph of f with the five points A, B, C, D, E labelled, and a table classifying each point.]

The Incorrect Recipe does correctly identify the points B = (−1, f(−1)) and C = (−3/5, f(−3/5)) as maximum and minimum points, respectively. But it makes two mistakes.

Mistake #1: D = (0, 0) is neither a maximum nor a minimum point, contrary to the Incorrect Recipe. (As we shall learn in Section 147, D is an inflexion point.)

Mistake #2: A and E are respectively a minimum and a maximum point, but neither is detected by the Incorrect Recipe.

Are the following statements true or false? To show that a statement is false, give a counterexample from the above example. If it is true, explain why.

… or minimum point is a stationary point.

We now give the Correct Recipe for finding maximum and minimum points:

The Correct Recipe for Finding Maximum and Minimum Points. Given a differentiable function f ∶ D → R,

1. Identify all the stationary points (i.e. x where f′(x) = 0).

2. Identify all the non-interior points.

3. Investigate whether each point identified is a max, min, or something else.

The Correct Recipe rectifies the Incorrect Recipe in two ways:

1. The Correct Recipe demands that you also check the non-interior points, which may possibly be extrema, but may be overlooked by the Incorrect Recipe.

2. The Correct Recipe does not assume that every single one of our shortlist of points (the stationary points and the non-interior points) is either a maximum point or a minimum point. It allows for the possibility that some of these points could be neither.

By the way, the condition that f is differentiable is very important. If f is not differentiable, then the above Correct Recipe might not work. But not to worry, since most functions on the A-levels are differentiable.

Example 122. Consider f ∶ [−1, 1] → R defined by x ↦ x³. Let’s apply the Correct Recipe.

1. Identify all the stationary points (i.e. x where f′(x) = 0).

f′(x) = 3x². So f′(x) = 0 ⇐⇒ x = 0. The only stationary point is x = 0.

2. Identify all the non-interior points.

Every point x ∈ (−1, 1) is in the interior of [−1, 1]. The only non-interior points are ±1.

3. Check if each of these points is a maximum point, a minimum point, or neither.

From a graph sketch, the stationary point x = 0 is neither a max nor a min point. The non-interior point −1 is a minimum point. The non-interior point 1 is a maximum point.

Altogether, we conclude that −1 is the only minimum point and 1 is the only maximum.
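As a rough illustration, the three steps of the Correct Recipe for f(x) = x³ on [−1, 1] can be sketched in Python. The helper name `classify` and the step size `eps` are assumptions made purely for this illustration. Comparing f(x) with a single nearby point on each side is only a numerical stand-in for the graph sketch, not a proof.

```python
def classify(f, x, lo, hi, eps=1e-4):
    """Classify x as 'max', 'min', 'both' or 'neither' for f on [lo, hi].

    Compares f(x) with f at nearby points that lie inside the domain.
    This mimics a graph sketch; it is not a rigorous test.
    """
    neighbours = [t for t in (x - eps, x + eps) if lo <= t <= hi]
    is_max = all(f(x) >= f(t) for t in neighbours)
    is_min = all(f(x) <= f(t) for t in neighbours)
    if is_max and is_min:
        return "both"
    return "max" if is_max else "min" if is_min else "neither"


def f(x):
    return x ** 3


lo, hi = -1.0, 1.0
stationary = [0.0]        # step 1: f'(x) = 3x^2 = 0 only at x = 0
non_interior = [lo, hi]   # step 2: the endpoints of [-1, 1]

# step 3: investigate every shortlisted point
results = {x: classify(f, x, lo, hi) for x in stationary + non_interior}
print(results)
```

Note that the shortlist contains both kinds of candidates (stationary and non-interior), and that the stationary point is allowed to turn out to be neither a maximum nor a minimum.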

Exercise 68. For each of the following functions, find all the maximum and minimum points using the Correct Recipe. (Answer on p. 954.)
(a) f ∶ R → R defined by x ↦ x.
(b) g ∶ [0, 1] → R defined by x ↦ x.
(c) h ∶ R → R defined by x ↦ x⁴ − 2x². Identify (if any) the global minimum point(s).

12 Concavity, Inflexion Points, and the 2DT

• A function is concave downwards (or simply concave) on an interval if the line segment connecting any two points of the graph in this interval is below the graph.
• A function is concave upwards (or simply convex) on an interval if the line segment connecting any two points of the graph in this interval is above the graph.
• An inflexion point is any point where the concavity of the function changes, either from downwards to upwards, or upwards to downwards.24

24 These are informal definitions. For the formal definitions, see p. 890 in the Appendices (optional).

Example 123. Graphed below is f ∶ R → R defined by x ↦ x³.

[Figure: graph of f(x) = x³ for −2 ≤ x ≤ 2, with the tangent line at x = 0. f is concave downwards on R₀⁻ and concave upwards on R₀⁺.]

f is concave downwards on R₀⁻ because there, the line segment connecting any two points on f is below the graph of f. In contrast, f is concave upwards on R₀⁺ because there, the line segment connecting any two points on f is above the graph of f. 0 is an inflexion point because this is where the function f changes from being concave downwards to being concave upwards.

A test for whether a point is an inflexion point is this: Draw the tangent line to the graph at that point. The point is an inflexion point ⇐⇒ The line is above the graph on one side of the point and below the graph on the other side (see Fact 95 in the Appendices).

The tangent line to the graph at the point 0 is drawn in green (it coincides with the horizontal axis). We indeed see that the line is above the graph on the left side of the point and below the graph on the right side of the point. Therefore, 0 is an inflexion point.

For a graph to be concave downwards, its slope must be decreasing. Conversely, to be concave upwards, its slope must be increasing. Altogether then, the following proposition is intuitively plausible.

Proposition 2. Let f ∶ D → R be a twice-differentiable function.
(a) f is concave downwards on an interval ⇐⇒ f″(x) ≤ 0 for every x in this interval.
(b) f is concave upwards on an interval ⇐⇒ f″(x) ≥ 0 for every x in this interval.
(c) x is an inflexion point ⟹ f″(x) = 0.

Proof. Optional, see p. 893 in the Appendices.

Example 147 (revisited). Consider f ∶ R → R defined by x ↦ x³. f is concave downwards on R₀⁻, concave upwards on R₀⁺, and has an inflexion point at x = 0. We can verify that, as per the above proposition:

$$f''(x) = 6x \;\begin{cases} < 0, & \text{for } x < 0, \\ = 0, & \text{for } x = 0, \\ > 0, & \text{for } x > 0. \end{cases}$$

It is very tempting to believe that the converse of part (c) of the above proposition is true. That is, it is very tempting to believe that “f″(x) = 0 ⟹ x is an inflexion point.”

But this is wrong! It is perfectly possible that f″(x) = 0 without x being an inflexion point! Here’s an example:

Example 124. Consider g ∶ R → R defined by x ↦ x⁴. We have g′(x) = 4x³ and g″(x) = 12x², so that g″(x) = 0 ⇐⇒ x = 0.

We might thus be tempted to conclude that 0 is an inflexion point. However, this is not the case. Although g ′′ (0) = 0, we have g ′′ (x) > 0 for x > 0 and we also have g ′′ (x) > 0 for x < 0, and so the concavity of g does not change at the point 0. To qualify as an inflexion point, the concavity of the function must change. At 0, the concavity of g does not change. Therefore, 0 is NOT an inflexion point.

[Figure: graph of g(x) = x⁴. g is concave upwards everywhere; g″(0) = 0, but 0 is not an inflexion point of g.]
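The point of this example is that concavity must actually change at an inflexion point. A minimal Python sketch of that check follows; the helper `concavity_changes` and the step `eps` are hypothetical names chosen for illustration, and sampling one point on each side is only a heuristic.

```python
def concavity_changes(second_derivative, x, eps=1e-3):
    """Return True if the second derivative has opposite signs just left and right of x."""
    left = second_derivative(x - eps)
    right = second_derivative(x + eps)
    return left * right < 0


def g2(x):
    return 12 * x ** 2   # g''(x) for g(x) = x^4


def f2(x):
    return 6 * x         # f''(x) for f(x) = x^3


print(concavity_changes(g2, 0.0))  # False: g'' is positive on both sides of 0
print(concavity_changes(f2, 0.0))  # True: f'' changes sign at 0
```

Both second derivatives vanish at 0, yet only x³ has an inflexion point there.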


Inflexion points may be further sub-divided into stationary points of inflexion and non-stationary points of inflexion.

Definition 46. A stationary point of inflexion is simply any point that is both an inflexion point and a stationary point. A non-stationary point of inflexion is simply any point that is an inflexion point, but not a stationary point.

The A-level syllabuses explicitly exclude non-stationary points of inflexion. Nonetheless, there is the temptation to believe that “every inflexion point must also be a stationary point”. Here’s a quick counter-example to dispel this false belief:

Example 125. The graph below is for the function f ∶ R → R defined by x ↦ x³ + x. We have f′(x) = 3x² + 1 and f″(x) = 6x. The point 0 is not a stationary point because f′(0) = 1 ≠ 0.

However, 0 is an inflexion point, because to the left of 0, f is concave downwards; and to the right, f is concave upwards. Indeed, it is a non-stationary point of inflexion.

Also illustrated is the tangent line at 0, namely y = x (whose slope is indeed non-zero). Observe that indeed, to the left of 0, the tangent line is above the graph; while to the right of 0, the tangent line is below the graph. This serves as a second way to verify that 0 is a point of inflexion.

[Figure: graph of f(x) = x³ + x with the tangent line at 0. f is concave downwards to the left of 0 and concave upwards to the right of 0.]

12.1 The Second Derivative Test (2DT)

[Figure: two graphs. In one, f must be concave upwards around the minimum turning point 0. In the other, f must be concave downwards around the maximum turning point 0.]

From graphs, it looks like around a maximum turning point a, f must be concave downwards, i.e. f″(a) < 0. Similarly, around a minimum turning point b, f must be concave upwards, i.e. f″(b) > 0. The next proposition is thus intuitively plausible.

Proposition 3. (Second Derivative Test [2DT].) Let f be a twice-differentiable function. Let a be a stationary point (i.e. f′(a) = 0).

1. If f″(a) < 0, then a is a maximum point.

2. If f″(a) > 0, then a is a minimum point.

3. If f″(a) = 0, then the 2DT is uninformative. That is, a could be a maximum point, a minimum point, an inflexion point, or something else altogether!

Proof. Optional, see p. 894 in the Appendices.

The third part of the above Proposition must be heavily emphasised: If f ′ (a) = 0 and f ′′ (a) = 0, then the 2DT tells us absolutely nothing about a! a could be a maximum point, a minimum point, an inflexion point, or something else altogether!

We previously gave the Correct Recipe for finding maximum and minimum points. Let’s now add the 2DT to this recipe:


The Enriched Recipe for Finding Maximum and Minimum Points. Given a twice-differentiable function f ∶ D → R,

1. Identify all the stationary points (i.e. a where f ′ (a) = 0).

(a) Evaluate f″ at each of these points.
(b) f″(a) < 0 ⟹ a is a maximum point. Similarly, f″(a) > 0 ⟹ a is a minimum point. If f″(a) = 0, then we need to determine the nature of a using some other method (e.g. sketch-the-graph).

2. Identify all the non-interior points.

(a) Check if each of these points is a maximum point, a minimum point, or neither.

If f is not twice-differentiable, then the Enriched Recipe may not work. Fortunately, most functions in A-levels are twice-differentiable.

Example 119 (revisited). Consider f ∶ [−1.5, 0.5] → R defined by x ↦ x⁵ + 2x⁴ + x³.

1. Identify all the stationary points. f′(x) = 5x⁴ + 8x³ + 3x² = x²(5x² + 8x + 3) = 0 ⇐⇒ x = 0 or x = −1, −0.6 (quadratic formula).

(a) f″(x) = 20x³ + 24x² + 6x = 2x(10x² + 12x + 3).
(b) f″(−0.6) > 0 ⟹ −0.6 is a minimum point. f″(−1) < 0 ⟹ −1 is a maximum point. But f″(0) = 0, so the 2DT tells us nothing. From a graph sketch, we see that 0 is an inflexion point.

2. The only two non-interior points are −1.5 and 0.5. Again by sketching the graph, we see that −1.5 is a minimum point and 0.5 is a maximum point.

Altogether, we conclude that there are two maximum points (−1 and 0.5) and two minimum points (−0.6 and −1.5).

Exercise 69. Use the Enriched Recipe to find the maximum and minimum points of each of the following functions. (Answer on p. 956.)
(a) g ∶ R → R defined by x ↦ x⁸ + x⁷ − x⁶.
(b) h ∶ (−π/2, π/2) → R defined by x ↦ tan x.
(c) i ∶ [0, 2π] → R defined by x ↦ sin x + cos x.
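For Example 119 (revisited), the Enriched Recipe can be sketched in Python as below. The function names, the tolerance `tol`, and the step `step` are assumptions made for this illustration; the endpoint check compares each endpoint with one nearby point inside the domain, as a stand-in for the graph sketch.

```python
def f(x):
    return x**5 + 2 * x**4 + x**3


def f2(x):
    # second derivative, as computed in step 1(a)
    return 20 * x**3 + 24 * x**2 + 6 * x


def second_derivative_test(x, tol=1e-9):
    """Apply the 2DT at a stationary point x."""
    value = f2(x)
    if value < -tol:
        return "max"
    if value > tol:
        return "min"
    return "uninformative"


# Step 1: the stationary points found from f'(x) = 0
stationary = [-1.0, -0.6, 0.0]
verdicts = {x: second_derivative_test(x) for x in stationary}

# Step 2: the non-interior points, compared with a nearby point inside the domain
step = 1e-4
endpoints = {
    -1.5: "min" if f(-1.5) < f(-1.5 + step) else "not a min",
    0.5: "max" if f(0.5) > f(0.5 - step) else "not a max",
}

print(verdicts)
print(endpoints)
```

The verdict at 0 comes back as uninformative, which is exactly the case where some other method (here, the graph sketch) is needed.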

12.2 Summary of Points and Venn Diagram

The Venn diagram below depicts the five types of points you need to know for the A-levels: Inflexion, maximum, minimum, stationary, and turning points. To its right is a graph of a rather-arbitrary function t ∶ D → R designed to illustrate these various points. The x- and y-coordinates of a are denoted a_x and a_y; similarly for other points.

[Figure: Venn diagram of the sets of inflexion, stationary, turning, maximum, and minimum points within the set of all points, with example points a to j; to its right, a graph of t with the same points labelled.]

• For most functions you’ll ever encounter, most points are like a. For lack of a better name, we can call such points boring points: a boring point is simply any point that is not an inflexion, maximum, minimum, stationary, or turning point.
• b is a non-stationary point of inflexion (explicitly excluded from the A-levels).
• c is a stationary point of inflexion.
• A point like d (not illustrated), i.e. a stationary point that is not a maximum, minimum, or inflexion point, is extremely unusual. You can find an exotic example on p. 894.
• f is both a maximum and minimum point because for all x ∈ D that are “close to” f_x ∈ D, we have t(x) ≤ t(f_x) ≤ t(x).
• The set of turning points is simply the intersection of the set of stationary points and the set of maximum and minimum points.
• h is a maximum point because t(x) ≤ t(h_x) for all x ∈ D that are “close to” h_x.
• j is a minimum point because t(x) ≥ t(j_x) for all x ∈ D that are “close to” j_x.
• i is both a maximum and minimum point because there are simply no x ∈ D that are “close to” i_x ∈ D, and thus it is trivially or vacuously true that t(x) ≤ t(i_x) ≤ t(x) for x that are “close to” i_x.25 i is not a stationary point because t′(i_x) ≠ 0; indeed, t′(i_x) is undefined.26

25 A point like i_x ∈ D that is not “close to” any other x ∈ D is, aptly enough, called an isolated point.

26 i_x is an example of a critical point. A critical point is any point that is either stationary or where the derivative is undefined. Don’t worry, this is not something you need to know for the A-levels.


Exercise 70. For each of the following equations, (i) sketch its graph. (ii) Write down the points at which it intersects the axes. (iii) Identify any turning points. (iv) Write down the equations of any lines of symmetry and also (v) asymptotes. (a) y = 2ex + x. (b) x = 3x + 2. (c) y = 2x2 + 1. (Answers on pp. 957, 958, and 959.)

13 Relating the Graph of f′ to that of f

Given the graph of f′, you are required to know how to figure out what f looks like. Let’s start with a very simple example.

Example 126. Let f ∶ R → R be some differentiable function. Graphed below in blue is its derivative f′. You are told also that f(0) = 2. What does the graph of f look like? (Pretend for a moment that you can’t see the red graph.)

[Figure: graph of f′ (blue), the horizontal line at height 1, and graph of f (red).]

The derivative simply gives the slope of f . Since f ′ (x) = 1 for all x, this means that f has constant slope of 1. We are given moreover that f (0) = 2 (i.e. the vertical intercept is 2). Altogether then, f (x) = x + 2 and is graphed in red above.

Example 127. Let g ∶ R⁻ ∪ R⁺ → R be some differentiable function. Graphed below in blue is its derivative g′. You are told also that $\lim_{x \to 0} g(x) = -2$. What does the graph of g look like? (Pretend for a moment that you can’t see the red graph.)

[Figure: graph of g′ (blue), equal to −1 for x < 0 and 1 for x > 0, and graph of g (red).]

The derivative simply gives the slope of g. Since g′(x) = −1 for all x < 0 and g′(x) = 1 for all x > 0, this means that g has constant slope of −1 for x < 0 and constant slope of 1 for all x > 0. We are given moreover that $\lim_{x \to 0} g(x) = -2$, so the two branches of g nearly meet at (0, −2), with a hole there. Altogether then,

$$g(x) = \begin{cases} -x - 2, & \text{for } x < 0, \\ x - 2, & \text{for } x > 0. \end{cases}$$

Or more concisely, g(x) = ∣x∣ − 2. Graphed above in red is g.

Example 128. Let h ∶ R → R be some differentiable function. Graphed below in blue is its derivative h′ defined by h′(x) = x. You are told also that h(0) = 0. What does the graph of h look like? (Pretend for a moment that you can’t see the red graph.)

[Figure: graph of h′ (blue) and graph of h (red).]

The derivative simply gives the slope of h. Since h′(x) < 0 for all x < 0, h′(0) = 0, and h′(x) > 0 for all x > 0, this means that h is strictly decreasing on R⁻, has a turning point at 0, and is strictly increasing on R⁺. Moreover, the derivative (slope) is increasing (indeed it is increasing at a constant rate), so the graph of h is concave upwards throughout.

Altogether then, even if we don’t know how to figure out what h(x) is, we can at least roughly sketch the graph of h (in red above). (Of course, you probably already know from secondary school that h(x) = x²/2, but we’re not supposed to know this until we learn about integration later in this textbook.)
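One informal way to "sketch" h from its slope is to start at the known value h(0) = 0 and accumulate small rises, slope times a small horizontal step. The Python sketch below does this with the trapezium rule; the function name `integrate_slope` and the number of steps are assumptions for illustration, and this numerical shortcut is not part of the textbook's method at this stage.

```python
def integrate_slope(slope, x_end, h_start=0.0, steps=1000):
    """Approximate h(x_end), given h(0) = h_start and h'(x) = slope(x).

    Walks from 0 to x_end in small steps, adding (average slope) * (step width)
    each time. Works for negative x_end too, since the step width is then negative.
    """
    dx = x_end / steps
    total = h_start
    for k in range(steps):
        a = k * dx
        b = (k + 1) * dx
        total += 0.5 * (slope(a) + slope(b)) * dx
    return total


def slope(x):
    return x  # h'(x) = x, as in Example 128


for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(x, round(integrate_slope(slope, x), 6))
```

The printed heights fall as x goes from −2 to 0 and rise again afterwards, matching the description of h as strictly decreasing on the left of 0 and strictly increasing on the right.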

14 Quick Revision: Quadratic Equations y = ax² + bx + c

Quadratic equations show up very often in various contexts. So here is a fairly complete if brisk review of quadratic equations, which you were supposed to have completely mastered in secondary school.

Example 129. Below are the graphs of the equations y = x² + 3x + 1 (red), y = x² + 2x + 1 (blue), y = x² + x + 1 (green), y = −x² + x + 1 (red dotted), y = −x² − 2x − 1 (blue dotted), and y = −x² − x − 1 (green dotted).

Here’s a whirlwind study of the quadratic equation y = ax² + bx + c. Assume that a ≠ 0, otherwise we are in the trivial case of a linear equation. First, write:

$$ax^2 + bx + c = a\left(x^2 + \frac{b}{a}x + \frac{c}{a}\right).$$

To complete the square, observe that (x + k)² = x² + 2kx + k² and so

$$x^2 + \frac{b}{a}x = \left(x + \frac{b}{2a}\right)^2 - \frac{b^2}{4a^2}.$$

Hence,

$$ax^2 + bx + c = a\left[\left(x + \frac{b}{2a}\right)^2 - \frac{b^2}{4a^2} + \frac{c}{a}\right] = a\left[\left(x + \frac{b}{2a}\right)^2 - \frac{b^2 - 4ac}{4a^2}\right].$$

What we just did above is called completing the square. We can use this to compute the zeros or roots of the equation ax² + bx + c = 0:

$$\begin{aligned} ax^2 + bx + c = 0 &\iff a\left[\left(x + \frac{b}{2a}\right)^2 - \frac{b^2 - 4ac}{4a^2}\right] = 0 \\ &\iff \left(x + \frac{b}{2a}\right)^2 - \frac{b^2 - 4ac}{4a^2} = 0 \\ &\iff \left(x + \frac{b}{2a}\right)^2 = \frac{b^2 - 4ac}{4a^2} \\ &\iff x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \end{aligned}$$

This last expression gives the roots of the equation ax² + bx + c = 0. This expression will NOT be printed in the A-Level List of Formulae! So be sure you remember it!

We can distinguish between six categories of quadratic equations, based on the signs of a (the coefficient of x²) and b² − 4ac (the discriminant). Each of these six categories was illustrated in the figure above.

Category and features:
1. a > 0, b² − 4ac > 0: ∪-shaped. Intersects the horizontal axis at two points.
2. a > 0, b² − 4ac = 0: ∪-shaped. Just touches the horizontal axis at the minimum point.
3. a > 0, b² − 4ac < 0: ∪-shaped. Doesn’t intersect the horizontal axis.
4. a < 0, b² − 4ac > 0: ∩-shaped. Intersects the horizontal axis at two points.
5. a < 0, b² − 4ac = 0: ∩-shaped. Just touches the horizontal axis at the maximum point.
6. a < 0, b² − 4ac < 0: ∩-shaped. Doesn’t intersect the horizontal axis.

The vertical intercept (the value of y at x = 0) is simply c.

The Sign of a. If a > 0, then the graph is ∪-shaped and has a minimum turning point at x = −b/(2a). Conversely, if a < 0, then the graph is ∩-shaped and has a maximum turning point at x = −b/(2a).

The Discriminant. The term b² − 4ac is called the discriminant. This name makes sense, because it helps us discriminate between several possible cases of the equation ax² + bx + c = 0:

• If b² − 4ac > 0, then:

– There are two real roots (or zeros or horizontal intercepts), namely $\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$.

– Moreover, we can write

$$ax^2 + bx + c = a\left(x - \frac{-b + \sqrt{b^2 - 4ac}}{2a}\right)\left(x - \frac{-b - \sqrt{b^2 - 4ac}}{2a}\right).$$

What we have just done is to factorise the expression ax² + bx + c. Factorisation is often a useful trick to play. Notice that if you plug in either of the roots into the right hand side (RHS) of the above equation, we do indeed get zero, as expected.

• If b² − 4ac = 0, then:

– There is only one real root (or zero or horizontal intercept), namely −b/(2a).

– Moreover, we can write

$$ax^2 + bx + c = a\left(x - \frac{-b}{2a}\right)^2 = a\left(x + \frac{b}{2a}\right)^2.$$

– Notice that if you plug x = −b/(2a) into the RHS of the above equation, we do indeed get zero, as expected.

• If b² − 4ac < 0, then:

– There are no real roots (or zeros or horizontal intercepts).
– There is no way to factorise the expression ax² + bx + c (unless we use complex numbers, which we’ll learn about only in Part IV).
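The case analysis above (sign of a, sign of the discriminant) can be sketched in Python as follows. The function name `describe_quadratic` and the labels "cup" and "cap" (for ∪-shaped and ∩-shaped) are assumptions chosen for this illustration. The test `disc == 0` is exact for the integer coefficients used here; with floating-point coefficients a tolerance would be more appropriate.

```python
import math


def describe_quadratic(a, b, c):
    """Return (shape, real_roots) for y = a*x**2 + b*x + c, where a != 0."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic")
    shape = "cup" if a > 0 else "cap"   # U-shaped or upside-down U-shaped
    disc = b * b - 4 * a * c            # the discriminant
    if disc > 0:
        root = math.sqrt(disc)
        roots = sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])
    elif disc == 0:
        roots = [-b / (2 * a)]
    else:
        roots = []                      # no real roots
    return shape, roots


# The six equations of Example 129, one per category
examples = [(1, 3, 1), (1, 2, 1), (1, 1, 1), (-1, 1, 1), (-1, -2, -1), (-1, -1, -1)]
for a, b, c in examples:
    print((a, b, c), describe_quadratic(a, b, c))
```

Each of the six coefficient triples lands in a different category, with two, one, or zero real roots depending on the discriminant.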

Exercise 71. For each of the following equations, sketch its graph and identify its intercepts and turning points (if these exist). (a) y = 2x² + x + 1. (b) y = −2x² + x + 1. (c) y = x² + 6x + 9. (Answer on p. 960.)

15 Transformations

15.1 y = f(x) + a

The graph of y = f(x) + a is simply the graph of y = f(x) translated (moved) upwards by a units.

Example 130. Define the function f ∶ R → R by x ↦ x³ − 1. The graphs of f (red) and y = f(x) + 2 (blue) are shown below. Notice the blue curve is simply the red curve translated upwards by 2 units.

15.2 y = f(x + a)

The graph of y = f(x + a) is simply the graph of y = f(x) translated leftwards by a units.

Why leftwards (and not rightwards)? The reason is that in order for f(x₁) and f(x₂ + a) to hit the same value, we must have x₂ = x₁ − a. That is, every x value is moved to the left by a units.

Example 131. Define the function f ∶ R → R by x ↦ x³ − 1. The graphs of f (red) and y = f(x + 2) (blue) are shown below. (The latter equation is simply y = (x + 2)³ − 1.) Notice the blue curve is simply the red curve translated leftwards by 2 units.

15.3 y = af(x)

The graph of y = af(x) is simply the graph of f(x) vertically-stretched (outwards from the horizontal axis) by a stretching factor of a.

Example 132. Define the function f ∶ R → R by x ↦ x³ − 1. The graphs of f (red) and y = 2f(x) (blue) are shown below.

Notice the blue curve is simply the red curve stretched vertically (outwards from the horizontal axis) by a factor of 2.

15.4 y = f(ax)

The graph of y = f(ax) is simply the graph of f(x) horizontally-stretched (outwards from the vertical axis) by a stretching factor of 1/a. Or equivalently, the graph of y = f(ax) is simply the graph of f(x) horizontally-compressed (inwards towards the vertical axis) by a compression factor of a.

Why a stretching factor of 1/a (and not a)? The reason is that in order for f(x₁) and f(ax₂) to hit the same value, we must have x₂ = x₁/a. That is, every x value is scaled by a factor of 1/a.

Example 133. Define the function f ∶ R → R by x ↦ x³ − 1. The graphs of f (red) and y = f(2x) (blue) are shown below. (The latter equation is simply y = (2x)³ − 1 = 8x³ − 1.)

Notice the blue curve is simply the red curve stretched horizontally (outwards from the vertical axis) by a factor of 1/2. (Again, the A-level exams might instead word this as a stretch with scale factor 0.5 parallel to the x-axis.) Equivalently, the blue curve is simply the red curve compressed horizontally (inwards towards the vertical axis) by a factor of 2.

15.5 Combinations of the Above

Example 134. Define the function f ∶ R → R by x ↦ x³ − 1. The graphs of f (red), y = 1.1f(x − 1) (blue), and y = f(1.1x) − 1 (green) are shown below. Notice the blue curve is simply the red curve translated rightwards by 1 unit and then stretched vertically (outwards from the horizontal axis) by a factor of 1.1.

Notice the green curve is simply the red curve stretched horizontally (outwards from the vertical axis) by a factor of 1/1.1 and then translated downwards by 1 unit.
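To check the two descriptions in Example 134, one can follow a single point of the red curve through each pair of transformations and confirm that it lands on the blue and green curves. The Python sketch below does this for the point (2, f(2)); the names `blue`, `green`, `blue_point`, and `green_point` are chosen only for this illustration.

```python
def f(x):
    return x ** 3 - 1


def blue(x):
    # y = 1.1 f(x - 1)
    return 1.1 * f(x - 1)


def green(x):
    # y = f(1.1 x) - 1
    return f(1.1 * x) - 1


# Start from the point (2, f(2)) = (2, 7) on the red curve.
x0, y0 = 2.0, f(2.0)

# Blue: translate rightwards by 1, then stretch vertically by a factor of 1.1.
blue_point = (x0 + 1, 1.1 * y0)

# Green: stretch horizontally by a factor of 1/1.1, then translate downwards by 1.
green_point = (x0 / 1.1, y0 - 1)

# Each moved point should lie on the corresponding transformed curve.
print(abs(blue(blue_point[0]) - blue_point[1]) < 1e-9)
print(abs(green(green_point[0]) - green_point[1]) < 1e-9)
```

The same check works for any other starting point on the red curve, which is what it means for the whole curve to be translated and stretched.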

15.6 y = ∣f(x)∣

The graph of y = ∣f(x)∣ is simply the graph of f(x), but with all points for which f(x) < 0 reflected in the horizontal axis.

Example 135. Define the function f ∶ R → R by x ↦ x³ − 1. The graphs of f (red) and y = ∣f(x)∣ (blue) are shown below.

15.7 y = f(∣x∣)

The graph of y = f(∣x∣) is simply the part of the graph of f(x) for which x ≥ 0, together with the reflection of that part in the vertical axis. (The part of the graph of f(x) for which x < 0 is discarded.)

Example 136. Define the function f ∶ R → R by x ↦ x³ − 1. The graphs of f (red) and y = f(∣x∣) (blue) are shown below.

15.8 y = 1/f(x)

Example 137. Define the function f ∶ R → R by x ↦ x³ − 1. The graphs of f (red) and y = 1/f(x) (blue) are shown below.

Notice that wherever f(x) = 0, we have a vertical asymptote for the graph of y = 1/f(x). So in this case, x = 1 ⟹ f(x) = 0 and thus x = 1 is a vertical asymptote for the graph of y = 1/f(x). As x approaches 1 from the left, 1/f(x) → −∞. And as x approaches 1 from the right, 1/f(x) → ∞.

Also, if as x → ±∞, f(x) → ±∞, then we also have 1/f(x) → 0, so that y = 0 is a horizontal asymptote. So here, as x → ∞, 1/f(x) approaches 0 from above and as x → −∞, 1/f(x) approaches 0 from below.

15.9 y² = f(x)

Three observations about the graph of y² = f(x):

1. It is symmetric in the horizontal axis. This is because if y₁ satisfies y₁² = f(x), then so too does −y₁.
2. If f(x) < 0, then there is no value of y for which y² = f(x). And so the graph of y² = f(x) is empty wherever f(x) < 0.
3. The graph of y² = f(x) intersects the horizontal axis at the same point as the graph of y = f(x). Moreover, at any such point, the tangent to the graph of y² = f(x) is vertical.

Example 138. Define the function f ∶ R → R by x ↦ x³ − 1. The graphs of f (red) and y² = f(x) (blue) are shown below.

Exercise 72. The graph of the function f ∶ R → R is drawn below in red. Graph each of the following equations. (a) y = ∣2f(3x)∣. (b) y = f(∣x − 1∣). (c) y² = f(x) + 4. (Answer on p. 961.)

Exercise 73. Describe a series of transformations that would transform the graph of y = 1/x to y = 3 − 1/(5x − 2). (Answer on p. 962.)

16 Conic Sections

Conic sections are formed from the intersection of a double cone and a 2D cartesian plane.

Take an infinitely large double cone (it goes upwards and downwards forever). Use a 2D cartesian plane to slice the double cone from all conceivable positions and at all conceivable angles. The intersection of the plane and the surface of the double cone forms curves which, aptly enough, are called conic sections.

The figure below27 doesn’t show the upper half of the double cone, but you can easily imagine it. Of the four curves depicted, only the hyperbola also cuts the upper half of the double cone.

27 Taken from Wikipedia, which has an excellent page on conic sections.

The three types of conic sections are the ellipse (plural: ellipses), the parabola (parabolae), and the hyperbola (hyperbolae). The circle is regarded as a special case of the ellipse.28 Here are the distinguishing characteristics of each:

• Ellipse. Formed from only one half of the double cone. A closed curve.29 Arises when B² − 4AC < 0.
• Parabola. Formed from only one half of the double cone. Not a closed curve. Arises when B² − 4AC = 0.
• Hyperbola. Formed from both halves of the double cone and is thus composed of two distinct branches. Not a closed curve. Arises when B² − 4AC > 0.

We can prove (but do not do so in this textbook) that in general, a conic section is the graph of the equation

$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0, \tag{1}$$

where A, B, C, D, E, F are real constants and x and y are the two variables (on the cartesian plane). We refer to the expression B² − 4AC as the discriminant of the above equation. It is so named because it discriminates between the three possible types of conic sections. We can prove (but do not do so in this textbook) that if B² − 4AC < 0, then we have an ellipse; if B² − 4AC = 0, then we have a parabola; and if B² − 4AC > 0, then we have a hyperbola.

In secondary school, we already learnt in some detail a special case of conic sections: the quadratic y = ax² + bx + c. This is the special case of equation (1) where A = a, B = 0, C = 0, D = b, E = −1, and F = c.

The quadratic y = ax² + bx + c is indeed a parabola, because B² − 4AC = 0² − 4(a)(0) = 0.

We already reviewed quadratic equations in section 14 and so we won’t talk any more about them in this chapter.
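The way the discriminant sorts conic sections into the three types (ellipse when B² − 4AC < 0, parabola when it equals 0, hyperbola when it is > 0, as in the list of characteristics above) can be sketched in Python. The function name `classify_conic` is an assumption for this illustration, and degenerate conic sections are ignored, as in the text.

```python
def classify_conic(A, B, C):
    """Classify Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0 by its discriminant B^2 - 4AC.

    D, E and F do not affect the discriminant, so they are not needed here.
    Degenerate conic sections are ignored.
    """
    disc = B * B - 4 * A * C
    if disc < 0:
        return "ellipse"
    if disc == 0:
        return "parabola"
    return "hyperbola"


print(classify_conic(1, 0, 1))    # x^2 + y^2 = 1, the unit circle
print(classify_conic(2, 0, 0))    # y = 2x^2 + 1, i.e. 2x^2 - y + 1 = 0
print(classify_conic(1, 0, -1))   # x^2 - y^2 = 1
```

For the quadratic y = ax² + bx + c, the coefficients are A = a, B = 0, C = 0, so the discriminant is 0 whatever the value of a.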

28 Strictly speaking, there are also the so-called degenerate conic sections, but we shall ignore these.

For A-levels, we are only required to learn about five more special cases of conic sections, listed below. And so that’s the plan for this chapter.

1. $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$,
2. $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$,
3. $\frac{y^2}{b^2} - \frac{x^2}{a^2} = 1$,
4. $y = \frac{ax + b}{cx + d}$,
5. $y = \frac{ax^2 + bx + c}{dx + e}$.

Exercise 74. As per the general form given in equation (1), state for each of the above five equations what A, B, C, D, E, and F are. Compute the discriminant for each equation. Hence, conclude that the first equation is of an ellipse and the remaining four are of hyperbolae. (Answer on p. 963.)

16.1 The Ellipse x² + y² = 1 (The Unit Circle)

The equation $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ describes an ellipse. In this section, we’ll study a special case of this equation, where a = b = 1. The equation then becomes x² + y² = 1, which is the unit circle centred on the origin. By unit circle, we mean that it has radius of unit length, i.e. length 1.

Why does this equation describe a circle? You can easily see that (1, 0), (0, 1), (−1, 0), and (0, −1) all satisfy the equation and are thus part of its graph. Indeed, these are the horizontal and vertical intercepts. What about elsewhere on the circle?

Consider any point p = (x, y) on the unit circle. It forms a right-angled triangle: the line connecting it to the origin is the hypotenuse (of length 1); that connecting it to the horizontal axis is the side; and that connecting it to the vertical axis is the base. By the Pythagorean Theorem, x² + y² = 1² = 1. We have just proven that every point (x, y) on the unit circle satisfies the equation x² + y² = 1. We now examine some of its characteristics.

1. Intercepts. The graph intersects the vertical axis at the points (0, −1) and (0, 1) and the horizontal axis at the points (−1, 0) and (1, 0).

2. Turning points. In this case, it is easy to see that there is a maximum turning point at (0, 1) and a minimum turning point at (0, −1). But just as an exercise, let’s also try to find these turning points more rigorously, i.e. through calculus.

Exercise 49 showed that although it is impossible to rewrite the equation x² + y² = 1 into the form of a single function, it is nonetheless possible to rewrite it into the form of two functions. Namely, f ∶ [−1, 1] → R defined by $x \mapsto \sqrt{1 - x^2}$ and g ∶ [−1, 1] → R defined by $x \mapsto -\sqrt{1 - x^2}$. Above, the graph of the function f is the upper semicircle (red) and the graph of the function g is the lower semicircle (blue).

Let’s compute the first derivative of f and set it equal to 0:

$$f'(x) = 0.5(1 - x^2)^{-0.5}(-2x) = -x(1 - x^2)^{-0.5},$$

$$-x(1 - x^2)^{-0.5} = 0 \implies x = 0.$$

So the only stationary point of the function f is 0. We must now determine whether it is a maximum, minimum, or inflexion point. Compute the second derivative and evaluate it at the stationary point:

$$f''(x) = -(1 - x^2)^{-0.5} - x\left[-0.5(1 - x^2)^{-1.5}(-2x)\right].$$

This second derivative is messy and can be further simplified, but in this case there is no need to simplify it, since all we want is to evaluate it at 0. We have

$$f''(0) = -(1 - 0^2)^{-0.5} - 0 \times \left[-0.5(1 - 0^2)^{-1.5}(-2 \times 0)\right] = -1 < 0.$$

Hence, the point x = 0 is a maximum turning point of f . We should make it a habit to write out the point in full, as (0, f (0)) = (0, 1).

Since g = −f , it follows that g ′ (0) = 0 and g ′′ (0) = 1 > 0. That is, the only stationary point of the function g is (0, g(0)) = (0, −1). And it is a minimum point.

3. Asymptotes. By observation, there are no asymptotes.

4. Symmetry. The graph is a perfect circle centred on the origin. So by observation, every line that passes through the origin is a line of symmetry!
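The turning-point calculation for the upper semicircle can be sanity-checked numerically. The sketch below is only an illustration: the helper names `first_derivative` and `second_derivative` are hypothetical, and they use central finite differences (an approximation, not the exact calculus used in the text).

```python
import math

def f(x):
    """Upper semicircle of the unit circle: f(x) = sqrt(1 - x^2)."""
    return math.sqrt(1 - x * x)

def first_derivative(func, x, h=1e-5):
    """Central-difference estimate of func'(x) (hypothetical helper)."""
    return (func(x + h) - func(x - h)) / (2 * h)

def second_derivative(func, x, h=1e-4):
    """Central-difference estimate of func''(x) (hypothetical helper)."""
    return (func(x + h) - 2 * func(x) + func(x - h)) / (h * h)

slope_at_0 = first_derivative(f, 0.0)
curvature_at_0 = second_derivative(f, 0.0)
print(slope_at_0, curvature_at_0)  # approximately 0 and -1
```

A slope of about 0 together with a negative second derivative is consistent with (0, 1) being a maximum turning point of f.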


16.2 The Ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$

Squares are a proper subset of rectangles. Similarly, circles are a proper subset of ellipses. The ellipse can be regarded as the generalisation of the circle. Why does the equation $x^2/a^2 + y^2/b^2 = 1$ describe an ellipse? Rewrite the equation as

$$\left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2 = 1.$$

Hence, going from x2 + y 2 = 1 to x2 /a2 + y 2 /b2 = 1 involves two transformations:

1. First, stretch the graph horizontally, outwards from the vertical axis, by a factor of a.
2. Then stretch the graph vertically, outwards from the horizontal axis, by a factor of b.

This gives us an “elongated circle” that we call an ellipse.

1. Intercepts. The graph intersects the vertical axis at the points (0, −b) and (0, b), and the horizontal axis at the points (−a, 0) and (a, 0).

2. Turning points. Clearly, there are maximum and minimum turning points at (0, b) and (0, −b). Let’s find these rigorously using calculus.

Let’s again break the equation up and rewrite it into the form of two functions. Namely, $f : [-a, a] \to \mathbb{R}$ defined by $x \mapsto |b|\sqrt{1 - x^2/a^2}$ and $g : [-a, a] \to \mathbb{R}$ defined by $x \mapsto -|b|\sqrt{1 - x^2/a^2}$. These are graphed above. Let’s compute the first derivative of $f$ and set it equal to 0:

$$f'(x) = 0.5|b|\left(1 - \frac{x^2}{a^2}\right)^{-0.5}\left(\frac{-2x}{a^2}\right) = -|b|x\left(1 - \frac{x^2}{a^2}\right)^{-0.5}a^{-2} = 0 \implies x = 0.$$

So the only stationary point of the function $f$ is at $x = 0$. We can show that it is a maximum point, by computing the second derivative and evaluating it at 0:

$$f''(x) = \frac{d}{dx}\left[-|b|x\left(1 - \frac{x^2}{a^2}\right)^{-0.5}a^{-2}\right] = -a^{-2}|b|\,\frac{d}{dx}\left[x\left(1 - \frac{x^2}{a^2}\right)^{-0.5}\right]$$

$$= -a^{-2}|b|\left[\left(1 - \frac{x^2}{a^2}\right)^{-0.5} - 0.5x\left(1 - \frac{x^2}{a^2}\right)^{-1.5}\left(\frac{-2x}{a^2}\right)\right],$$

$$f''(0) = -a^{-2}|b|\left[\left(1 - \frac{0^2}{a^2}\right)^{-0.5} - 0.5(0)\left(1 - \frac{0^2}{a^2}\right)^{-1.5}\left(\frac{-2(0)}{a^2}\right)\right] = -a^{-2}|b| < 0.$$

So $(0, f(0)) = (0, |b|)$ is a maximum turning point of $f$ (this is the point (0, b) if b > 0).

And since $g = -f$, $g'(0) = 0$ and $g''(0) = a^{-2}|b| > 0$. That is, the only stationary point of $g$ is $(0, -|b|)$ and it is a minimum point.

3. Asymptotes. By observation, there are no asymptotes.

4. Symmetry. By observation, there are only two lines of symmetry, namely y = 0 and x = 0 (the horizontal and vertical axes).

Exercise 75. (Answer on p. 964.) Let a, b, c, d be constants with a, b non-zero. Consider the equation

$$\frac{(x + c)^2}{a^2} + \frac{(y + d)^2}{b^2} = 1.$$

(i) Sketch its graph. (ii) Write down the points at which it intersects the axes. (iii) Identify any turning points. (iv) Write down the equations of any lines of symmetry and also (v) asymptotes.
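The two-stretch description of the ellipse in this section can be checked with a small sketch. It samples points on the unit circle, stretches them by a and b, and measures how far the results are from satisfying the ellipse equation. The values a = 3 and b = 2 and the function names are assumptions chosen purely for illustration.

```python
import math

def unit_circle_points(n=9):
    """Sample points (x, y) on the unit circle x^2 + y^2 = 1."""
    points = []
    for i in range(n):
        x = -1 + 2 * i / (n - 1)
        y = math.sqrt(max(0.0, 1 - x * x))
        points.extend([(x, y), (x, -y)])
    return points

def stretch(point, a, b):
    """Stretch horizontally by a factor of a, then vertically by a factor of b."""
    x, y = point
    return (a * x, b * y)

a, b = 3.0, 2.0  # hypothetical values chosen for illustration
max_residual = max(
    abs(X**2 / a**2 + Y**2 / b**2 - 1)
    for X, Y in (stretch(p, a, b) for p in unit_circle_points())
)
print(max_residual)  # close to 0, up to floating-point rounding
```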


16.3 The Hyperbola: y = 1/x

y = 1/x (graphed) is the first hyperbola we’ll study. It is also the simplest possible hyperbola.

[Figure: graph of y = 1/x, showing its two branches, the centre (0, 0), the horizontal asymptote y = 0, the vertical asymptote x = 0, and the lines of symmetry y = x and y = −x.]

It turns out that all hyperbolae we’ll study have some common features. They have two branches. In the case of y = 1/x, one branch is top-right and the other is bottom-left.


Moreover, for all hyperbolae we’ll study,

1. Intercepts. Hyperbolae may or may not cross the axes. It depends. y = 1/x is an example of a hyperbola that crosses neither the vertical nor the horizontal axis. (But this is not true of all hyperbolae.)

2. Turning points. Hyperbolae may or may not have turning points. It depends.

y = 1/x is an example of a hyperbola that has no turning points. (But this is not true of all hyperbolae.)

3. Asymptotes. Hyperbolae always have two asymptotes. In the case of y = 1/x, they are y = 0 and x = 0.

An interesting feature here is that the two asymptotes are perpendicular. A rectangular hyperbola is any hyperbola whose two asymptotes are perpendicular. And so y = 1/x is an example of a rectangular hyperbola. (But as we’ll see, not all hyperbolae are rectangular.)

4. The centre is the point at which the two asymptotes intersect. In the case of y = 1/x, the centre is (0, 0).

5. Two lines of symmetry. Both pass through the centre. Moreover, each line of symmetry bisects an angle formed by the two asymptotes. In the case of y = 1/x, they are y = x and y = −x.


16.4 The Hyperbola $x^2 - y^2 = 1$

$x^2 - y^2 = 1$ is a hyperbola and so it has two distinct branches. Notice also that if x ∈ (−1, 1), then there is no value of y for which $x^2 - y^2 = 1$. Hence, the graph of this equation is empty in the region where x ∈ (−1, 1).

1. Intercepts. The graph crosses the horizontal axis at the points (−1, 0) and (1, 0), but does not intersect the vertical axis.

2. There are no turning points.

3. Asymptotes. We have $y = \pm\sqrt{x^2 - 1}$. So as $x \to \infty$, $y = \pm\sqrt{x^2 - 1} \to \pm\sqrt{x^2} = \pm x$. (Informally, as x → ∞, the 1 becomes negligible and we can simply ignore it.) And so the two asymptotes are y = x and y = −x. The two asymptotes are perpendicular and so this is a rectangular hyperbola.

4. The centre (point at which the two asymptotes intersect) is (0, 0).

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes. Moreover, both pass through the centre (0, 0). Since the asymptotes y = x and y = −x have slopes 1 and −1, the lines that bisect the angles between them are the horizontal and vertical axes. So the lines of symmetry are y = 0 and x = 0.

16.5 The Hyperbola $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$

Getting from the equation $x^2 - y^2 = 1$ to the equation

$$\left(\frac{x}{a}\right)^2 - \left(\frac{y}{b}\right)^2 = 1$$

involves two simple transformations:

1. First stretch the graph horizontally, outwards from the vertical axis, by a factor of a.
2. Then stretch the graph vertically, outwards from the horizontal axis, by a factor of b.

The graph’s characteristics are similar to before. Again, $\left(\frac{x}{a}\right)^2 - \left(\frac{y}{b}\right)^2 = 1$ is a hyperbola and so it has two distinct branches. Notice also that if x ∈ (−a, a), then there is no value of y for which $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$. Hence, the graph of this equation is empty in the region where x ∈ (−a, a).

1. Intercepts. The graph crosses the horizontal axis at the points (−a, 0) and (a, 0), but does not intersect the vertical axis.

2. There are no turning points.

3. Asymptotes. We have $y = \pm|b|\sqrt{\left(\frac{x}{a}\right)^2 - 1}$. So as $x \to \infty$, $y = \pm|b|\sqrt{\left(\frac{x}{a}\right)^2 - 1} \to \pm|b|\sqrt{\left(\frac{x}{a}\right)^2} = \pm|b|\frac{x}{a}$ (taking a > 0). And so the two asymptotes are $y = b\frac{x}{a}$ and $y = -b\frac{x}{a}$. Their slopes, b/a and −b/a, have product $-b^2/a^2$, which equals −1 only if |a| = |b|. So the two asymptotes are perpendicular (and this is a rectangular hyperbola) only in the special case |a| = |b|.

4. The centre (point at which the two asymptotes intersect) is (0, 0).

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes. Moreover, both pass through the centre (0, 0). So they must be y = 0 and x = 0.

Exam Tip
On the A-level exams, they typically only ask for (i) the intercepts; (ii) the asymptotes; and (iii) turning points.

Nonetheless, you might as well know about the centre and the two lines of symmetry, because these concepts are not difficult and will help you to sketch better graphs.
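The asymptote claim for this hyperbola can be illustrated numerically: the vertical gap between the line y = |b|x/a and the upper branch shrinks as x grows. This is a minimal sketch under stated assumptions; the values a = 2, b = 3 and the function names are hypothetical and chosen only for illustration.

```python
import math

def upper_branch(x, a, b):
    """Upper half of x^2/a^2 - y^2/b^2 = 1, valid for |x| >= |a|."""
    return abs(b) * math.sqrt((x / a) ** 2 - 1)

def asymptote(x, a, b):
    """The line y = |b| x / a (taking a > 0)."""
    return abs(b) * x / a

a, b = 2.0, 3.0  # hypothetical values chosen for illustration
gaps = [asymptote(x, a, b) - upper_branch(x, a, b) for x in (10.0, 100.0, 1000.0)]
print(gaps)  # each gap is positive and smaller than the one before
```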


16.6 The Hyperbola $\frac{y^2}{b^2} - \frac{x^2}{a^2} = 1$

SYLLABUS ALERT

$y^2/b^2 - x^2/a^2 = 1$ is explicitly in the 9758 (revised) but not the 9740 (old) syllabus. But even if you’re taking 9740, you might as well learn to draw $y^2/b^2 - x^2/a^2 = 1$, because it’s really simple (since you now know how to draw $x^2/a^2 - y^2/b^2 = 1$).

The graph of the equation $\frac{y^2}{b^2} - \frac{x^2}{a^2} = 1$ is closely related to the graph we studied in the previous section: it is the graph of $\frac{x^2}{b^2} - \frac{y^2}{a^2} = 1$ (the previous section’s equation with a and b swapped), rotated by $\frac{\pi}{2}$ clockwise (or anticlockwise).

Let’s summarise the graph’s characteristics. This is a hyperbola and so there are two distinct branches. Notice also that if y ∈ (−b, b), then there is no value of x for which $\frac{y^2}{b^2} - \frac{x^2}{a^2} = 1$. Hence, the graph of this equation is empty in the region where y ∈ (−b, b). The range of y is thus (−∞, −b] ∪ [b, ∞) (taking b > 0).

1. Intercepts. The graph crosses the vertical axis at the points (0, −b) and (0, b), but does not intersect the horizontal axis.

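2. The two turning points are (0, b) (minimum) and (0, −b) (maximum).

3. Asymptotes. We have $y = \pm|b|\sqrt{1 + \left(\frac{x}{a}\right)^2}$. So as $x \to \infty$, $y = \pm|b|\sqrt{1 + \left(\frac{x}{a}\right)^2} \to \pm|b|\sqrt{\left(\frac{x}{a}\right)^2} = \pm|b|\frac{x}{a}$ (taking a > 0). And so the two asymptotes are $y = b\frac{x}{a}$ and $y = -b\frac{x}{a}$. Their slopes, b/a and −b/a, have product $-b^2/a^2$, which equals −1 only if |a| = |b|. So the two asymptotes are perpendicular (and this is a rectangular hyperbola) only in the special case |a| = |b|.

4. The centre (point at which the two asymptotes intersect) is (0, 0).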

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes. Moreover, both pass through the centre (0, 0). So they must be y = 0 and x = 0.

If we’d like, we can also find the turning points of $\frac{y^2}{b^2} - \frac{x^2}{a^2} = 1$ more rigorously, that is, through calculus. As with the circle, although it is not possible to rewrite this equation into the form of a single function, it is possible to rewrite it into the form of two functions. Namely, $f : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto |b|\sqrt{\left(\frac{x}{a}\right)^2 + 1}$ and $g : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto -|b|\sqrt{\left(\frac{x}{a}\right)^2 + 1}$. (Both are defined for every real x, because $\left(\frac{x}{a}\right)^2 + 1$ is always positive.) The graph of the function $f$ is entirely above the horizontal axis, while that of $g$ is entirely below the horizontal axis. Let’s compute the first derivative of $f$ (taking a > 0):

$$f'(x) = 0.5|b|\left[\left(\frac{x}{a}\right)^2 + 1\right]^{-0.5}\left(\frac{2x}{a^2}\right) = \frac{|b|}{a}\,\frac{x}{\sqrt{x^2 + a^2}}.$$

This equals 0 only when x = 0. Hence, the only stationary point of $f$ is (0, b). Let’s check what sort of a stationary point this is.

$$f''(x) = \frac{|b|}{a}\cdot\frac{\sqrt{x^2 + a^2} - x(0.5)\left(x^2 + a^2\right)^{-0.5}(2x)}{x^2 + a^2}.$$

And so $f''(0) = \frac{|b|}{a}\cdot\frac{a}{a^2} = \frac{|b|}{a^2} > 0$. Hence, this is a minimum point.

Similarly, by computing the first derivative of g and doing the work, we can find that the only stationary point of g is (0, −b) and that this is a maximum point.


16.7 Long Division of Polynomials

Remember long division? Turns out it’ll be useful for dividing polynomials. Here are a couple of primary school examples to jog your memory.

Example 139. What’s 83 ÷ 7? By long division, the quotient is 11 with a remainder of 6. So, $83 \div 7 = 11\tfrac{6}{7}$.

$$\begin{array}{r} 11 \\ 7\,\overline{)\,83} \\ \underline{77} \\ 6 \end{array}$$

The quotient is the integer portion of the solution and the remainder is the “left-over” integer.

Example 140. What’s 470 ÷ 17? By long division, the quotient is 27 with a remainder of 11. So, $470 \div 17 = 27\tfrac{11}{17}$.

$$\begin{array}{r} 27 \\ 17\,\overline{)\,470} \\ \underline{459} \\ 11 \end{array}$$
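The two integer examples above can be reproduced with Python’s built-in `divmod`, which returns the quotient and remainder together. The wrapper name `long_divide` is a hypothetical label used only for this illustration.

```python
def long_divide(dividend, divisor):
    """Return (quotient, remainder) for positive integers, as in long division."""
    quotient, remainder = divmod(dividend, divisor)
    return quotient, remainder

print(long_divide(83, 7))    # (11, 6)
print(long_divide(470, 17))  # (27, 11)
```

In each case, dividend = divisor × quotient + remainder, e.g. 83 = 7 × 11 + 6.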


Long division can be used to divide one polynomial by another. But first of all, in case you don’t remember what a polynomial is...

Definition 47. An nth-degree polynomial in one variable is any expression

$$a_0 x^n + a_1 x^{n-1} + a_2 x^{n-2} + \dots + a_{n-1} x + a_n,$$

where each $a_i$ is a constant (with $a_0 \neq 0$) and x is the variable.

In this textbook, we’ll almost always consider only polynomials in one variable. So when I say polynomial, I’ll always mean a polynomial in one variable, unless otherwise stated.[30]

Example 141. The expressions $7x - 3$ and $4x + 2$ are 1st-degree polynomials (in one variable). These are also called linear polynomials. (Polynomials of low degree are often also called by such special names.)

Example 142. The expressions $3x^2 + 4x - 5$ and $-x^2 + 2x + 1$ are 2nd-degree polynomials. These are also called quadratic polynomials.

Example 143. The expressions $2x^3 + 2x^2 + 3x - 1$ and $-3x^3 + 2x^2 + 3x + 1$ are 3rd-degree polynomials. These are also called cubic polynomials.

Example 144. The expressions $5x^4 - 2x^3 + 2x^2 + 3x - 1$ and $-9x^4 + 3x^3 + 2x^2 + 3x + 1$ are 4th-degree polynomials. These are also called quartic polynomials.

[30] Actually, we’ve already secretly studied an example of a polynomial in two variables: the expression on the LHS of the equation of the conic section, $Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$.


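Now let’s do some polynomial division.

Example 145. Say you have an expression $\frac{x^2 + 3}{x - 1}$. We might be perfectly content with this expression. Or we might try to simplify it through long division:

$$\begin{array}{rrrr}
 & & x & +1 \\ \hline
x - 1 \;\big) & x^2 & +0x & +3 \\
 & x^2 & -x & \\ \hline
 & & x & +3 \\
 & & x & -1 \\ \hline
 & & & 4
\end{array}$$

The “quotient” is $x + 1$ and the “remainder” is 4. Hence,

$$\frac{x^2 + 3}{x - 1} = x + 1 + \frac{4}{x - 1}.$$

Example 146. Let’s simplify $\frac{4x^3 + 2x^2 + 1}{2x^2 - x - 1}$ through long division:

$$\begin{array}{rrrrr}
 & & & 2x & +2 \\ \hline
2x^2 - x - 1 \;\big) & 4x^3 & +2x^2 & +0x & +1 \\
 & 4x^3 & -2x^2 & -2x & \\ \hline
 & & 4x^2 & +2x & +1 \\
 & & 4x^2 & -2x & -2 \\ \hline
 & & & 4x & +3
\end{array}$$

The “quotient” is $2x + 2$ and the “remainder” is $4x + 3$. Hence,

$$\frac{4x^3 + 2x^2 + 1}{2x^2 - x - 1} = 2x + 2 + \frac{4x + 3}{2x^2 - x - 1}.$$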
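The long-division procedure used in Examples 145 and 146 can be sketched in code. This is an illustrative sketch rather than a definitive implementation: the function name `poly_divmod` is hypothetical, polynomials are represented (by assumption) as coefficient lists from the highest power down to the constant term, and the divisor’s leading coefficient is assumed to be non-zero.

```python
from fractions import Fraction

def poly_divmod(dividend, divisor):
    """Divide polynomials given as coefficient lists (highest power first).

    Returns (quotient, remainder) as lists of Fractions.
    Assumes the divisor's leading coefficient is non-zero.
    """
    remainder = [Fraction(c) for c in dividend]
    divisor = [Fraction(c) for c in divisor]
    quotient = []
    while len(remainder) >= len(divisor):
        # Choose the factor that cancels the current leading term.
        factor = remainder[0] / divisor[0]
        quotient.append(factor)
        for i, d in enumerate(divisor):
            remainder[i] -= factor * d
        remainder.pop(0)  # the leading term is now zero, so drop it
    return quotient, remainder

# Example 145: (x^2 + 3) / (x - 1)
print(poly_divmod([1, 0, 3], [1, -1]))
# Example 146: (4x^3 + 2x^2 + 1) / (2x^2 - x - 1)
print(poly_divmod([4, 2, 0, 1], [2, -1, -1]))
```

For Example 145 the result corresponds to quotient x + 1 and remainder 4. For Example 146 it corresponds to quotient 2x + 2 and remainder 4x + 3.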

Exercise 76. Use long division to simplify each of the following. (Answer on p. 965.)

(a) $\frac{16x + 3}{5x - 2}$. (b) $\frac{4x^2 - 3x + 1}{x + 5}$. (c) $\frac{x^2 + x + 3}{-x^2 - 2x + 1}$.

16.8 The Hyperbola $y = \frac{bx + c}{dx + e}$

In the next section, we’ll study this equation:

$$y = \frac{ax^2 + bx + c}{dx + e}.$$

To warm up, here we’ll study the special case of the above equation, where a = 0:

$$y = \frac{bx + c}{dx + e}.$$

Example 147. Graphed below is the equation y = (2x + 1)/(x + 1). This is the case where b = 2, c = 1, d = 1, and e = 1. Do the long division:

$$\begin{array}{rrr}
 & & 2 \\ \hline
x + 1 \;\big) & 2x & +1 \\
 & 2x & +2 \\ \hline
 & & -1
\end{array}
\implies y = \frac{2x + 1}{x + 1} = 2 - \frac{1}{x + 1}.$$

[Figure: graph of y = (2x + 1)/(x + 1), showing the centre (−1, 2), the horizontal asymptote y = 2, the vertical asymptote x = −1, and the lines of symmetry y = x + 3 and y = −x + 1.]

As usual, this is a hyperbola with two distinct branches. Other features:

1. Intercepts. The graph intersects the vertical axis at the point (0, 1) and the horizontal axis at the point (−0.5, 0).

2. There are no turning points.

3. Asymptotes. As x → −1, y → ±∞. And so x = −1 is a vertical asymptote. As x → ±∞, y → 2. And so y = 2 is a horizontal asymptote. The two asymptotes are perpendicular and so this is a rectangular hyperbola.

4. The centre (point at which the two asymptotes intersect) is (−1, 2).

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes. So they must have slope 1 and −1. Moreover, both pass through the centre (−1, 2). Altogether, we can work out that the lines of symmetry are y = x + 3 and y = −x + 1.


Example 148. Graphed below is the equation y = (7x + 3)/(2x + 4). This is the case where b = 7, c = 3, d = 2, and e = 4. Do the long division:

$$\begin{array}{rrr}
 & & 3.5 \\ \hline
2x + 4 \;\big) & 7x & +3 \\
 & 7x & +14 \\ \hline
 & & -11
\end{array}
\implies y = \frac{7x + 3}{2x + 4} = 3.5 - \frac{11}{2x + 4}.$$

Let’s summarise the graph’s characteristics. This is a hyperbola and so there are two distinct branches.

1. Intercepts. The graph intersects the vertical axis at the point (0, 0.75) and the horizontal axis at the point (−3/7, 0).

2. There are no turning points.

3. Asymptotes. As x → −2, y → ±∞. And so x = −2 is a vertical asymptote. As x → ±∞, y → 3.5. And so y = 3.5 is a horizontal asymptote. The two asymptotes are perpendicular and so this is a rectangular hyperbola.

4. The centre (point at which the two asymptotes intersect) is (−2, 3.5).

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes. So they must have slope 1 and −1. Moreover, both pass through the centre (−2, 3.5). Altogether, we can work out that the lines of symmetry are y = x + 5.5 and y = −x + 1.5.

Let’s now look, more generally, at the equation $y = \frac{bx + c}{dx + e}$. By long division, we have:

$$\begin{array}{rrr}
 & & b/d \\ \hline
dx + e \;\big) & bx & +c \\
 & bx & +be/d \\ \hline
 & & c - be/d
\end{array}$$

The “quotient” is b/d and the “remainder” is c − be/d. Let’s further simplify this so that x has no coefficient.

$$\frac{bx + c}{dx + e} = \frac{b}{d} + \frac{c - be/d}{dx + e} = \frac{b}{d} + \frac{c - be/d}{d}\cdot\frac{1}{x + e/d} = \frac{b}{d} + \frac{cd - be}{d^2}\cdot\frac{1}{x + e/d}.$$

We can thus get from y = 1/x to the above equation, through these transformations:

1. Shift the graph leftwards by $\frac{e}{d}$ units to get the graph of $y = \frac{1}{x + e/d}$.

2. Stretch the graph vertically, outwards from the horizontal axis, by a factor of $\frac{cd - be}{d^2}$ to get the graph of $y = \frac{cd - be}{d^2(x + e/d)}$.

3. Finally, shift the graph upwards by b/d units to get the final graph.

Exam Tip The A-level exams often ask you to list down a series of transformations that will get you from one graph to another, as was just done.
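The three transformations listed above (shift left, stretch vertically, shift up) can be composed in code and compared against the original formula. This is a minimal sketch; the function names are hypothetical, and the coefficients are those of Example 148 (b = 7, c = 3, d = 2, e = 4).

```python
import math

def direct(x, b, c, d, e):
    """Evaluate y = (bx + c) / (dx + e) directly."""
    return (b * x + c) / (d * x + e)

def via_transformations(x, b, c, d, e):
    """Build the same value from y = 1/x using the three transformations."""
    shifted = 1 / (x + e / d)                      # 1. shift left by e/d
    stretched = (c * d - b * e) / d**2 * shifted   # 2. stretch by (cd - be)/d^2
    return stretched + b / d                       # 3. shift up by b/d

b, c, d, e = 7, 3, 2, 4  # coefficients from Example 148
sample_xs = [-5.0, -1.0, 0.0, 1.0, 10.0]  # avoids the vertical asymptote x = -2
agree = all(
    math.isclose(direct(x, b, c, d, e), via_transformations(x, b, c, d, e))
    for x in sample_xs
)
print(agree)  # True
```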


Let’s now summarise the characteristics of the graph of the equation $y = \frac{bx + c}{dx + e}$. This is a hyperbola with two distinct branches.

1. Intercepts. If e = 0, then the graph does not cross the vertical axis. If e ≠ 0, then the graph intersects the vertical axis at the point (0, c/e). If b = 0, then the graph does not cross the horizontal axis. If b ≠ 0, then the graph intersects the horizontal axis at the point (−c/b, 0).

2. There are no turning points.[31]

3. Asymptotes. As x → −e/d, y → ±∞. And so x = −e/d is a vertical asymptote. As x → ±∞, y → b/d. And so y = b/d is a horizontal asymptote. The two asymptotes are perpendicular and so this is a rectangular hyperbola.

4. The centre (point at which the two asymptotes intersect) is (−e/d, b/d).

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes. So they must have slope 1 and −1. Moreover, both pass through the centre (−e/d, b/d). Altogether, we can work out that the lines of symmetry are y = x + e/d + b/d and y = −x − e/d + b/d.

[31] See p. 852 in the Appendices (optional) for a proof that $y = \frac{bx + c}{dx + e}$ has no turning points.

Exercise 77. For each of the following equations, sketch its graph and identify its intercepts, turning points, asymptotes, centre, and lines of symmetry (if there are any of these). (Answers on pp. 966, 967, and 968.)

(a) $y = \frac{3x + 2}{x + 2}$. (b) $y = \frac{x - 2}{-2x + 1}$. (c) $y = \frac{-3x + 1}{2x + 3}$.
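The summary of features for y = (bx + c)/(dx + e) translates directly into a small helper. This is a sketch under stated assumptions: the function name and dictionary keys are hypothetical, and it assumes d ≠ 0 and cd − be ≠ 0 (so that the graph really is a hyperbola).

```python
def hyperbola_features(b, c, d, e):
    """Summarise y = (bx + c)/(dx + e), assuming d != 0 and cd - be != 0."""
    centre = (-e / d, b / d)
    return {
        "vertical_asymptote_x": centre[0],    # the line x = -e/d
        "horizontal_asymptote_y": centre[1],  # the line y = b/d
        "centre": centre,
        "y_intercept": None if e == 0 else (0, c / e),
        "x_intercept": None if b == 0 else (-c / b, 0),
        # Lines of symmetry y = x + k1 and y = -x + k2, stored as (k1, k2).
        "symmetry_constants": (e / d + b / d, -e / d + b / d),
    }

features = hyperbola_features(2, 1, 1, 1)  # Example 147: y = (2x + 1)/(x + 1)
print(features["centre"])              # (-1.0, 2.0)
print(features["symmetry_constants"])  # (3.0, 1.0), i.e. y = x + 3 and y = -x + 1
```

Running it on the coefficients of Example 148 (7, 3, 2, 4) gives the centre (−2, 3.5) and the lines y = x + 5.5 and y = −x + 1.5, matching the worked example.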

16.9 The Hyperbola $y = \frac{ax^2 + bx + c}{dx + e}$

We now study the more general equation

$$y = \frac{ax^2 + bx + c}{dx + e}.$$

We’ll rule out the following cases.

• a = 0, because in that case

$$\frac{ax^2 + bx + c}{dx + e} = \frac{bx + c}{dx + e},$$

and this was already studied in the last section.

• d = 0, because in that case

$$\frac{ax^2 + bx + c}{dx + e} = \frac{ax^2 + bx + c}{e} = \frac{a}{e}x^2 + \frac{b}{e}x + \frac{c}{e},$$

which is a quadratic and which we already studied in secondary school.

• Both c and e are 0, because in that case

$$\frac{ax^2 + bx}{dx} = \frac{a}{d}x + \frac{b}{d},$$

which is a linear expression.

We’ll start with the simplest possible case (a = 1, b = 0, c = 1, d = 1, and e = 0). This is the equation

$$y = \frac{x^2 + 1}{x}.$$

Example 149. Graphed below is the equation $y = (x^2 + 1)/x$.

Do the long division:

$$\begin{array}{rrr}
 & x & \\ \hline
x \;\big) & x^2 & +1 \\
 & x^2 & \\ \hline
 & & 1
\end{array}
\implies y = \frac{x^2 + 1}{x} = x + \frac{1}{x}.$$

As usual, this is a hyperbola that has two distinct branches. Other features:

1. Intercepts. The graph intersects neither the vertical axis nor the horizontal axis.

2. There are two turning points: (−1, −2) is a maximum turning point and (1, 2) is a minimum turning point. (To find these, compute the first derivative $dy/dx = 1 - 1/x^2$. Set this equal to 0 to find two stationary points: x = ±1. Use the Second Derivative Test (2DT) to determine that x = −1 and x = 1 are, respectively, maximum and minimum turning points.)

By observation, y can take on any value except those between these two turning points. The range of y is thus (−∞, −2] ∪ [2, ∞).

3. Asymptotes. As x → 0, y → ±∞. Hence, there is one vertical asymptote: x = 0. As x → ±∞, y → x. Hence, there is one oblique asymptote: y = x. The two asymptotes are not perpendicular and so this is not a rectangular hyperbola.

4. The centre (point at which the two asymptotes intersect) is (0, 0).

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes and pass through the centre. You don’t need to learn how to figure out their equations (but see pp. 853ff. in the Appendices if you’re interested).

Example 150. Graphed below is the equation $y = \frac{x^2 + 3x + 1}{x + 1}$. Do the long division:

$$\begin{array}{rrrr}
 & & x & +2 \\ \hline
x + 1 \;\big) & x^2 & +3x & +1 \\
 & x^2 & +x & \\ \hline
 & & 2x & +1 \\
 & & 2x & +2 \\ \hline
 & & & -1
\end{array}
\implies y = \frac{x^2 + 3x + 1}{x + 1} = x + 2 - \frac{1}{x + 1}.$$

As usual, this is a hyperbola that has two distinct branches. Other features:

1. Intercepts. The graph intersects the vertical axis at the point (0, 1) and the horizontal axis at the points $(0.5(-3 + \sqrt{5}), 0)$ and $(0.5(-3 - \sqrt{5}), 0)$. (The horizontal intercepts are simply the zeros of the quadratic $x^2 + 3x + 1$.)

2. There are no turning points. (Compute $dy/dx = 1 + 1/(x + 1)^2$. Set this equal to 0: there are no stationary points and thus no turning points either.)

By observation, y can take on any value. The range of y is thus R.

3. Asymptotes. As x → −1, y → ±∞. Hence, there is one vertical asymptote: x = −1. As x → ±∞, y → x + 2. Hence, there is one oblique asymptote: y = x + 2. The two asymptotes are not perpendicular and so this is not a rectangular hyperbola.

4. The centre (point at which the two asymptotes intersect) is (−1, 1).

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes and pass through the centre. Again, you don’t need to know how to find their equations.

Example 151. Graphed below is the equation $y = \frac{2x^2 + 2x + 1}{-x + 1}$. Do the long division:

$$\begin{array}{rrrr}
 & & -2x & -4 \\ \hline
-x + 1 \;\big) & 2x^2 & +2x & +1 \\
 & 2x^2 & -2x & \\ \hline
 & & 4x & +1 \\
 & & 4x & -4 \\ \hline
 & & & 5
\end{array}
\implies y = \frac{2x^2 + 2x + 1}{-x + 1} = -2x - 4 + \frac{5}{-x + 1}.$$

As usual, this is a hyperbola that has two distinct branches. Other features:

1. Intercepts. The graph intersects the vertical axis at the point (0, 1), but not the horizontal axis, because there are no real zeros for the quadratic $2x^2 + 2x + 1$.

2. There are two turning points: $(1 - \sqrt{2.5}, 0.325)$ and $(1 + \sqrt{2.5}, -12.325)$ are, respectively, the minimum and maximum turning points. (Verify this.)

By observation, y can take on any value except those between these two turning points. The range of y is thus (−∞, −12.325] ∪ [0.325, ∞).

3. Asymptotes. As x → 1, y → ±∞. Hence, there is one vertical asymptote: x = 1. As x → ±∞, y → −2x − 4. Hence, there is one oblique asymptote: y = −2x − 4. The two asymptotes are not perpendicular and so this is not a rectangular hyperbola.

4. The centre (point at which the two asymptotes intersect) is (1, −6).

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes and pass through the centre. Again, you don’t need to know how to find their equations.
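The turning points stated in Example 151 can be verified numerically. The sketch below is one possible check, not the book’s own working: it uses the derivative of the long-division form y = −2x − 4 + 5/(1 − x), and the function names are hypothetical.

```python
import math

def y(x):
    """The equation from Example 151."""
    return (2 * x**2 + 2 * x + 1) / (-x + 1)

def dy_dx(x):
    """First derivative, from y = -2x - 4 + 5/(1 - x)."""
    return -2 + 5 / (1 - x) ** 2

def d2y_dx2(x):
    """Second derivative of the same expression."""
    return 10 / (1 - x) ** 3

x_min = 1 - math.sqrt(2.5)
x_max = 1 + math.sqrt(2.5)
print(round(y(x_min), 3), round(y(x_max), 3))  # 0.325 -12.325
print(d2y_dx2(x_min) > 0, d2y_dx2(x_max) < 0)  # True True
```

A positive second derivative at the first point and a negative one at the second are consistent with a minimum and a maximum turning point respectively.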


Because it gets rather messy, we will not look more generally at the equation

$$y = \frac{ax^2 + bx + c}{dx + e}.$$

But you can read more about it in the Appendices (p. 853 onwards).

Exercise 78. For each of the following equations, sketch its graph and identify its intercepts, turning points, asymptotes, centre, and lines of symmetry (if any of these exist). (Answers on pp. 969, 971, and 973.)

(a) $y = \frac{x^2 + 2x + 1}{x - 4}$. (b) $y = \frac{-x^2 + x - 1}{x + 1}$. (c) $y = \frac{2x^2 - 2x - 1}{x + 4}$.

17 Simple Parametric Equations

A graph (or curve) is simply a set of points. Parametric equations give us an alternative method of describing the same graph (or curve).

Example 152. Recall that the graph of the equation $x^2 + y^2 = 1$ (i.e. the set $S = \{(x, y) : x^2 + y^2 = 1\}$) is the unit circle centred on the origin.

[Figure: the unit circle $x^2 + y^2 = 1$, traced anti-clockwise, with the particle’s position, velocity, and acceleration labelled at t = 0, t = 3π/4, and t = 3π/2. Arrows indicate the instantaneous direction of travel.]

Observe that if x = cos t and y = sin t, then by a trigonometric identity, x2 + y 2 = 1. As it turns out, this gives us a second way of writing the set S: S = {(x, y) ∶ x = cos t, y = sin t, t ∈ R}.

The variable t is called a parameter, hence the name parametric equations. As t increases from 0 to 2π, we trace out, anti-clockwise, a unit circle centred on the origin.

t = 0 ⟹ (x, y) = (1, 0),
t = π/4 ⟹ (x, y) = (√2/2, √2/2),
t = 2π/4 ⟹ (x, y) = (0, 1),
t = 3π/4 ⟹ (x, y) = (−√2/2, √2/2),
t = 4π/4 ⟹ (x, y) = (−1, 0),
t = 5π/4 ⟹ (x, y) = (−√2/2, −√2/2),
t = 6π/4 ⟹ (x, y) = (0, −1),
t = 7π/4 ⟹ (x, y) = (√2/2, −√2/2).
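The list of points above can be regenerated with a short loop. This is an illustrative sketch: it evaluates (cos t, sin t) at multiples of π/4, rounds to four decimal places for display (√2/2 ≈ 0.7071), and checks that each point satisfies x² + y² = 1 up to floating-point rounding.

```python
import math

points = []
for k in range(8):
    t = k * math.pi / 4
    x, y = math.cos(t), math.sin(t)
    # Every point generated this way should lie on the unit circle.
    assert abs(x**2 + y**2 - 1) < 1e-12
    points.append((round(x, 4), round(y, 4)))

for k, point in enumerate(points):
    print(f"t = {k}π/4 -> {point}")
```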


A nice interpretation is of the parameter t as time.

Example 152 (continued from above). The set S = {(x, y) ∶ x = cos t, y = sin t, 0 ≤ t < 2π} can also be interpreted as tracing the motion of a particle as it moves anti-clockwise around a circle. x and y give the displacements of the particle (in metres) from the origin, in the x- and y-directions. We have x = cos t and y = sin t. This says that at any instant of time t, the particle is cos t metres to the east of the origin and sin t metres to the north of the origin. (Note that if cos t < 0, then the particle is to the west of the origin. And if sin t < 0, then the particle is to the south of the origin.)

At time t = 0 s, the particle is at the position (x, y) = (1, 0). At time t = 1 s, the particle has moved to the position (x, y) = (0.54, 0.84). At time t = π/2 ≈ 1.57 s, the particle has moved to position (x, y) = (0, 1).

Having interpreted t as time, we can now also easily talk about the velocity and acceleration of the particle at different instants in time.

Example 152 (continued from above). We have x = cos t and y = sin t. From this, we can easily compute the particle’s velocity in each direction: $v_x = dx/dt = -\sin t$ and $v_y = dy/dt = \cos t$.

This says that at any instant of time t, the velocity of the particle is −sin t ms⁻¹ in the x-direction and cos t ms⁻¹ in the y-direction. (Note that if −sin t < 0, then the particle is moving westwards. And if cos t < 0, then the particle is moving southwards.) So for example, at time t = 7π/4, its velocity is −sin(7π/4) = √2/2 ms⁻¹ rightwards and cos(7π/4) = √2/2 ms⁻¹ upwards.

Similarly, we can compute the particle’s acceleration in each direction: $a_x = d^2x/dt^2 = -\cos t$ and $a_y = d^2y/dt^2 = -\sin t$. So for example, at time t = 7π/4, its acceleration is −cos(7π/4) = −√2/2 ms⁻² rightwards and −sin(7π/4) = √2/2 ms⁻² upwards. That is, the particle is travelling rightwards (because its velocity rightwards at this instant in time is positive); however, its rightwards velocity is decreasing.

Exercise 79. (Answer on p. 975.) Let P be the particle whose position (in metres) is described by the set {(x, y) ∶ x = cos t, y = sin t, t ∈ R}, where t is time (seconds). Let Q be the particle whose position (in metres) is described by the set {(x, y) ∶ x = sin t, y = cos t, t ∈ R}. (a) How does the starting point (when t = 0) of Q differ from that of P? (b) What about the direction of travel?
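The velocity and acceleration values quoted at t = 7π/4 for the particle with x = cos t and y = sin t can be reproduced numerically. This is a minimal sketch; the function names `velocity` and `acceleration` are hypothetical and simply encode the derivatives given in the example.

```python
import math

def velocity(t):
    """(v_x, v_y) for x = cos t, y = sin t."""
    return (-math.sin(t), math.cos(t))

def acceleration(t):
    """(a_x, a_y) for x = cos t, y = sin t."""
    return (-math.cos(t), -math.sin(t))

t = 7 * math.pi / 4
vx, vy = velocity(t)
ax, ay = acceleration(t)
print(round(vx, 4), round(vy, 4))  # 0.7071 0.7071
print(round(ax, 4), round(ay, 4))  # -0.7071 0.7071
```

The positive v_x with negative a_x matches the observation that the particle is moving rightwards while its rightwards velocity is decreasing.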


Example 153. Recall that the graph of the equation x2 /a2 + y 2 /b2 = 1 — i.e. the set T = {(x, y) ∶ x2 /a2 + y 2 /b2 = 1} — is the ellipse centred on the origin, with horizontal intercepts ±a and vertical intercepts ±b.

[Figure: the ellipse, traced anti-clockwise, with the particle’s positions at t = 0, t = 3π/4, and t = 3π/2 marked. Arrows indicate the instantaneous direction of travel.]

Observe that if x = a cos t and y = b sin t, then by the same trigonometric identity as before, x2 /a2 + y 2 /b2 = 1. As it turns out, this gives us a second way of writing the set T : T = {(x, y) ∶ x = a cos t, y = b sin t, t ∈ R} .

Similar to before, as t increases from 0 to 2π, we trace out, anti-clockwise, an ellipse centred on the origin. At any instant in time t, the particle’s position, velocity, and acceleration are (x, y) = (a cos t, b sin t), $(v_x, v_y) = (-a \sin t, b \cos t)$, and $(a_x, a_y) = (-a \cos t, -b \sin t)$.

Exercise 80. Let P be the particle whose position (in metres) is described by the set {(x, y) ∶ x = a cos t, y = b sin t, t ∈ R}, where t is time (seconds). At each of the following times, state the particle’s position and also its velocity and acceleration in both the x- and y-directions. (a) t = π/4; (b) t = π/2; (c) t = 2π. (Answer on p. 975.)

Example 154. Recall that the graph of the equation x2 − y 2 = 1 — i.e. the set U = {(x, y) ∶ x2 − y 2 = 1} — is the rectangular “east-west” hyperbola centred on the origin, with horizontal intercepts ±1 and no vertical intercepts.

[Figure: graph of $x^2 - y^2 = 1$ with the particle’s positions at t = 0, 1, 2, 3, 4, and 5 marked. Arrows indicate the instantaneous direction of travel.]

Observe that if x = sec t and y = tan t, then by a trigonometric identity, $x^2 - y^2 = 1$. As it turns out, this gives us a second way of writing the set U: U = {(x, y) ∶ x = sec t, y = tan t, t ∈ R, t ≠ kπ/2 for any odd integer k}.

Note that t cannot be a half-integer multiple of π (i.e. ±π/2, ±3π/2, ...), because then sec t and tan t would be undefined. Again, let’s interpret this as the movement of a particle. Interestingly, the particle always moves upwards, as we can easily prove: $v_y = dy/dt = \sec^2 t > 0$ for all t.

At t = 0, the particle is at (x, y) = (1, 0). During t ∈ [0, π/2), the particle moves northeast along the green segment and flies off towards infinity as t → π/2 ≈ 1.57. An instant after π/2 seconds, the particle magically reappears “near” infinity in the southwest. During t ∈ (π/2, π], the particle moves northeast along the blue segment. At t = π, the particle is at (−1, 0).

During t ∈ [π, 3π/2), the particle moves northwest along the red segment and flies off towards infinity as t → 3π/2 ≈ 4.71.

An instant after 3π/2 seconds, the particle magically reappears “near” infinity in the southeast. During t ∈ (3π/2, 2π], the particle moves northwest along the pink segment.
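The claims in Example 154 can be spot-checked numerically. The following is a minimal Python sketch, not part of the original text; the function names position and vertical_velocity are hypothetical, and the six times are the ones marked in the figure.

```python
import math

def position(t):
    """Position (x, y) = (sec t, tan t); undefined wherever cos t = 0."""
    return 1 / math.cos(t), math.tan(t)

def vertical_velocity(t):
    """v_y = dy/dt = sec^2 t."""
    return 1 / math.cos(t) ** 2

for t in range(6):  # t = 0, 1, 2, 3, 4, 5
    x, y = position(t)
    on_hyperbola = math.isclose(x * x - y * y, 1.0, abs_tol=1e-9)
    moving_up = vertical_velocity(t) > 0
    print(t, round(x, 3), round(y, 3), on_hyperbola, moving_up)
```

At t = 2, for instance, both coordinates come out negative, which is consistent with the particle being on the southwest segment during t ∈ (π/2, π].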


Exercise 81. (Answer on p. 976.) Suppose that the position of a particle is described by the set {(x, y) ∶ x = tan t, y = sec t, t ∈ R}, where t is time, measured in seconds. (a) Rewrite the set using a single cartesian equation.

(b) Compute dx/dt. And hence make an observation about how the particle travels in the x-direction. The graph below indicates six positions of the particle — A, B, C, D, E, and F . (Also indicated are the directions of travel.) The particle is at these positions at times t = 0, 1, 2, 3, 4, and 5 but not necessarily in that order. (c) Using only the graphs of s = tan t and s = sec t (above) to guide you and without using a calculator, state where the particle is, at each of the times t = 0, 1, 2, 3, 4, and 5.

[Figure: graph of {(x, y) ∶ x = tan t, y = sec t, t ∈ R} plotted for −5 ≤ x ≤ 5 and −5 ≤ y ≤ 5, with six marked positions A, B, C, D, E, and F. Arrows indicate the instantaneous direction of travel.]


17.1 Eliminating the Parameter t

Given a pair of parametric equations that describes a set of points, we can often go in reverse: we can eliminate the parameter t and describe the same set of points using a single equation.

Example 155. The set $\{(x, y) : x = t^2 + t,\ y = t - 1,\ t \in \mathbb{R}\}$ describes the position (metres) of a particle at time t (seconds).

[Figure: graph of $x = y^2 + 3y + 2$, with arrows indicating the instantaneous direction of travel and the following values marked.]

t = −1: x = 0, y = −2, $v_x = 2t + 1 = -1$ ms$^{-1}$, $v_y = 1$ ms$^{-1}$, $a_x = 2$ ms$^{-2}$, $a_y = 0$ ms$^{-2}$.
t = 0: x = 0, y = −1, $v_x = 2t + 1 = 1$ ms$^{-1}$, $v_y = 1$ ms$^{-1}$, $a_x = 2$ ms$^{-2}$, $a_y = 0$ ms$^{-2}$.
t = 1: x = 2, y = 0, $v_x = 2t + 1 = 3$ ms$^{-1}$, $v_y = 1$ ms$^{-1}$, $a_x = 2$ ms$^{-2}$, $a_y = 0$ ms$^{-2}$.

Write t = y + 1 and $x = (y + 1)^2 + (y + 1) = y^2 + 3y + 2$. And so the same set can be rewritten as $\{(x, y) : x = y^2 + 3y + 2\}$. As an exercise, let’s also compute the velocity and acceleration of the particle.

$v_x = dx/dt = 2t + 1$ and $v_y = dy/dt = 1$. This says that at any instant in time t, the particle has velocity $2t + 1$ ms$^{-1}$ rightwards and 1 ms$^{-1}$ upwards.

$a_x = d^2x/dt^2 = 2$ and $a_y = d^2y/dt^2 = 0$. This says that the particle is always accelerating rightwards at the rate 2 ms$^{-2}$. Moreover, it is never accelerating upwards (this is consistent with the above finding that its upwards velocity is a constant 1 ms$^{-1}$).
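As a quick sanity check on Example 155, here is a minimal Python sketch (an illustration only; the helper names position and central_difference are hypothetical). It uses exact fractions to confirm that sampled positions satisfy the Cartesian equation, and it estimates the velocity with a symmetric difference quotient, which happens to be exact for a quadratic such as $t^2 + t$.

```python
from fractions import Fraction

def position(t):
    """Position (x, y) of the particle in Example 155 at time t."""
    return t * t + t, t - 1

def central_difference(t, h=Fraction(1, 1000)):
    """Estimate (v_x, v_y) at time t by a symmetric difference quotient."""
    x_after, y_after = position(t + h)
    x_before, y_before = position(t - h)
    return (x_after - x_before) / (2 * h), (y_after - y_before) / (2 * h)

for t in (Fraction(-1), Fraction(0), Fraction(1)):
    x, y = position(t)
    satisfies_cartesian = (x == y * y + 3 * y + 2)
    print(t, (x, y), satisfies_cartesian, central_difference(t))
```

The printed velocities can be compared against $v_x = 2t + 1$ and $v_y = 1$ at the three marked times.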


Example 156. The set {(x, y) ∶ x = 2 cos t − 4, y = 3 sin t + 1, t ∈ R} describes the position (metres) of a particle at time t (seconds).

[Figure: the ellipse traced by the particle, plotted for −7 ≤ x ≤ 1 and −3 ≤ y ≤ 5, with the following values marked.]

t = π/2: x = −4, y = 4, $v_x = -2\sin t = -2$ ms$^{-1}$, $v_y = 3\cos t = 0$ ms$^{-1}$, $a_x = -2\cos t = 0$ ms$^{-2}$, $a_y = -3\sin t = -3$ ms$^{-2}$.
t = π: x = −6, y = 1, $v_x = -2\sin t = 0$ ms$^{-1}$, $v_y = 3\cos t = -3$ ms$^{-1}$, $a_x = -2\cos t = 2$ ms$^{-2}$, $a_y = -3\sin t = 0$ ms$^{-2}$.
t = 3π/2: x = −4, y = −2, $v_x = -2\sin t = 2$ ms$^{-1}$, $v_y = 3\cos t = 0$ ms$^{-1}$, $a_x = -2\cos t = 0$ ms$^{-2}$, $a_y = -3\sin t = 3$ ms$^{-2}$.

Write (x + 4)/2 = cos t and (y − 1)/3 = sin t. Using the trigonometric identity $\cos^2 t + \sin^2 t = 1$, we can rewrite the set as $\{(x, y) : [(x + 4)/2]^2 + [(y - 1)/3]^2 = 1\}$. This is the ellipse centred on (−4, 1). As an exercise, let’s also compute the velocity and acceleration of the particle.

$v_x = dx/dt = -2\sin t$ and $v_y = dy/dt = 3\cos t$. This says that at any instant in time t, the particle has velocity $2\sin t$ ms$^{-1}$ leftwards and $3\cos t$ ms$^{-1}$ upwards.

$a_x = d^2x/dt^2 = -2\cos t$ and $a_y = d^2y/dt^2 = -3\sin t$. This says that at any instant in time t, the particle is accelerating rightwards at the rate $-2\cos t$ ms$^{-2}$ (equivalently, leftwards at the rate $2\cos t$ ms$^{-2}$) and upwards at the rate $-3\sin t$ ms$^{-2}$.


Exercise 82. Each of the following sets describes the position (metres) of a particle at time t (seconds). Rewrite each set into a form where the parameter t is eliminated. Sketch the graph of each. Indicate the particle’s position and direction of travel at t = 0. (Answers on pp. 977, 978, and 979.)

(a) $\{(x, y) : x = 2\sin t - 1,\ y = 3\cos^2 t,\ t \in \mathbb{R}\}$.

(b) $\{(x, y) : x = \frac{1}{t - 1},\ y = t^2 + 1,\ t \in \mathbb{R},\ t \neq 1\}$.

(c) $\{(x, y) : x = t - 1,\ y = \ln(2t + 1),\ t > -0.5\}$.


18 Equations and Inequalities

Given any fraction $\frac{N}{D}$ (where N and D are real numbers with D non-zero), we have $\frac{N}{D} > 0$ if and only if one of the following is true:

1. “N > 0 AND D > 0”; OR

2. “N < 0 AND D < 0”.

The expressions that are in the numerator (N ) and denominator (D) can get pretty complicated. So here are some very simple examples just to warm you up.

Example 157. $\frac{4}{7} > 0$ because both the numerator and denominator are positive.

Example 158. $\frac{-5}{-3} > 0$ because both the numerator and denominator are negative.

Example 159. $\frac{-9}{2} < 0$ because the numerator is negative but the denominator is positive.

Example 160. $\frac{1}{-8} < 0$ because the numerator is positive but the denominator is negative.


18.1 $\frac{ax + b}{cx + d} > 0$

Example 161. $\frac{x + 3}{3x + 2} > 0$ ⇐⇒ one of the following is true:

1. “x + 3 > 0 AND 3x + 2 > 0”; OR

2. “x + 3 < 0 AND 3x + 2 < 0”.

Notice that (1) “x + 3 > 0 AND 3x + 2 > 0” ⇐⇒ “x > −3 AND x > −2/3”, which in turn is equivalent to the single inequality “x > −2/3”.

Notice that (2) “x + 3 < 0 AND 3x + 2 < 0” ⇐⇒ “x < −3 AND x < −2/3”, which in turn is equivalent to the single inequality “x < −3”.

Altogether then, $\frac{x + 3}{3x + 2} > 0$ ⇐⇒ “x > −2/3 OR x < −3” (equivalently, “x ∈ (−∞, −3) ∪ (−2/3, ∞)”).

Note that I use quotation marks “⋅”, but these are not necessary. Instead, they merely help to make especially clear which groups of conditions correspond to each other.

Example 162. $\frac{4x - 1}{x + 2} > 0$ ⇐⇒ one of the following is true:

1. “4x − 1 > 0 AND x + 2 > 0” ⇐⇒ “x > 1/4 AND x > −2” ⇐⇒ “x > 1/4”; OR

2. “4x − 1 < 0 AND x + 2 < 0” ⇐⇒ “x < 1/4 AND x < −2” ⇐⇒ “x < −2”.

Altogether then, $\frac{4x - 1}{x + 2} > 0$ ⇐⇒ “x > 1/4 OR x < −2” (equivalently, “x ∈ (−∞, −2) ∪ (1/4, ∞)”).

Example 163. $\frac{5x + 4}{-2x + 1} > 0$ ⇐⇒ one of the following is true:

1. “5x + 4 > 0 AND −2x + 1 > 0” ⇐⇒ “x > −4/5 AND x < 1/2” ⇐⇒ “x ∈ (−4/5, 1/2)” ; OR

2. “5x + 4 < 0 AND −2x + 1 < 0” ⇐⇒ “x < −4/5 AND x > 1/2”, but these are mutually contradictory and thus impossible.

Altogether then, $\frac{5x + 4}{-2x + 1} > 0$ ⇐⇒ “x ∈ (−4/5, 1/2)”.

When given any inequality that is of a slightly different form, be sure to always convert it into what I’ll call the standard form $\frac{N}{D} > 0$. Strictly speaking, this is not necessary, but if you always do this, you’ll make a habit of solving inequalities in this form, and thus be less likely to make a careless mistake.

Example 164. Consider the inequality $\frac{3x - 2}{-5x + 1} < 3$. This inequality is equivalent to

$3 - \frac{3x - 2}{-5x + 1} > 0$ ⇐⇒ $\frac{-15x + 3 - (3x - 2)}{-5x + 1} > 0$ ⇐⇒ $\frac{-18x + 5}{-5x + 1} > 0$.

This last inequality is in turn true ⇐⇒ one of the following is true:

1. “−18x + 5 > 0 AND −5x + 1 > 0” ⇐⇒ “x < 5/18 AND x < 1/5” ⇐⇒ “x < 1/5” (because 1/5 < 5/18); OR

2. “−18x + 5 < 0 AND −5x + 1 < 0” ⇐⇒ “x > 5/18 AND x > 1/5” ⇐⇒ “x > 5/18”.

Altogether then, $\frac{3x - 2}{-5x + 1} < 3$ ⇐⇒ “x < 1/5 OR x > 5/18” (equivalently, “x ∈ (−∞, 1/5) ∪ (5/18, ∞)”).
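The two-case method of this section is mechanical enough to sketch in Python. This is only an illustration, assuming that a and c are non-zero integers; the helper names half_line, intersect, and solve are hypothetical. Each open interval is stored as a pair (lo, hi), with None standing for an unbounded end.

```python
from fractions import Fraction

def half_line(p, q, positive):
    """Open interval where p*x + q > 0 (positive=True) or < 0. Assumes p != 0."""
    root = Fraction(-q, p)
    # p*x + q > 0 means x > root when p > 0, and x < root when p < 0.
    if (p > 0) == positive:
        return (root, None)
    return (None, root)

def intersect(first, second):
    """Intersection of two open intervals, or None if it is empty."""
    lows = [v for v in (first[0], second[0]) if v is not None]
    highs = [v for v in (first[1], second[1]) if v is not None]
    lo = max(lows) if lows else None
    hi = min(highs) if highs else None
    if lo is not None and hi is not None and lo >= hi:
        return None
    return (lo, hi)

def solve(a, b, c, d):
    """Solution set of (a*x + b)/(c*x + d) > 0 as a list of open intervals."""
    case_1 = intersect(half_line(a, b, True), half_line(c, d, True))    # N > 0 AND D > 0
    case_2 = intersect(half_line(a, b, False), half_line(c, d, False))  # N < 0 AND D < 0
    return [interval for interval in (case_1, case_2) if interval is not None]

print(solve(1, 3, 3, 2))     # Example 161
print(solve(4, -1, 1, 2))    # Example 162
print(solve(5, 4, -2, 1))    # Example 163
print(solve(-18, 5, -5, 1))  # Example 164, after conversion to standard form
```

Working with Fraction rather than floating-point numbers keeps endpoints such as −2/3 and 5/18 exact, so the output can be compared directly with the hand-worked answers.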

Exercise 83. For what values of x is each of the following inequalities true? (a) $\frac{2x + 1}{3x + 2} > 0$. (b) $\frac{x - 1}{-4} > 0$. (c) $\frac{-1}{-4} > 0$. (d) $\frac{1}{-4} > 0$. (e) $\frac{-3x - 18}{9x - 14} > 0$. (f) $\frac{2x + 3}{-x + 7} < 9$. (Answers on p. 980.)


18.2 $\frac{ax^2 + bx + c}{dx^2 + ex + f} > 0$

Don’t worry, you are not required to know how to graph the equation $y = \frac{ax^2 + bx + c}{dx^2 + ex + f}$. But you are required to know how to find the values of x for which $\frac{ax^2 + bx + c}{dx^2 + ex + f} > 0$. This is just the same game as before, albeit slightly more complicated.

Example 165. $\frac{2x^2 + x + 3}{-x^2 + 3x + 2} > 0$ ⇐⇒ one of the following is true:

1. “$2x^2 + x + 3 > 0$ AND $-x^2 + 3x + 2 > 0$”; OR

2. “$2x^2 + x + 3 < 0$ AND $-x^2 + 3x + 2 < 0$”.

$y = 2x^2 + x + 3$ is a ∪-shaped quadratic and has no real roots (because the discriminant is negative). Hence, it is always positive. It is thus impossible that “$2x^2 + x + 3 < 0$ AND $-x^2 + 3x + 2 < 0$” (Case 2).

We need thus only examine Case 1. As we just said, it is always true that $2x^2 + x + 3 > 0$. So we need only examine when it is true that $-x^2 + 3x + 2 > 0$. The equation $y = -x^2 + 3x + 2$ has a ∩-shaped graph and has two real zeros given by:

$x = \frac{-3 \pm \sqrt{3^2 - 4(-1)(2)}}{2(-1)} = \frac{-3 \pm \sqrt{17}}{-2} = \frac{3 \mp \sqrt{17}}{2} = 0.5(3 \mp \sqrt{17}).$

Hence, $-x^2 + 3x + 2 > 0$ ⇐⇒ “$x \in (0.5(3 - \sqrt{17}),\ 0.5(3 + \sqrt{17}))$”.

Altogether then, $\frac{2x^2 + x + 3}{-x^2 + 3x + 2} > 0$ ⇐⇒ “$x \in (0.5(3 - \sqrt{17}),\ 0.5(3 + \sqrt{17}))$”.

A dirty trick is to use your TI84 to do a quick check that this answer is correct:


Example 166. $\frac{-x^2 + 4x - 1}{2x^2 + x + 2} > 0$ ⇐⇒ one of the following is true:

1. “$-x^2 + 4x - 1 > 0$ AND $2x^2 + x + 2 > 0$”; OR

2. “$-x^2 + 4x - 1 < 0$ AND $2x^2 + x + 2 < 0$”.

The equation $y = 2x^2 + x + 2$ has a ∪-shaped graph and has no real zeros (because the discriminant is negative). Hence, it is always positive. It is thus impossible that “$-x^2 + 4x - 1 < 0$ AND $2x^2 + x + 2 < 0$” (Case 2).

We need thus only examine Case 1. As we just said, it is always true that $2x^2 + x + 2 > 0$. So we need only examine when it is true that $-x^2 + 4x - 1 > 0$. The equation $y = -x^2 + 4x - 1$ has a ∩-shaped graph and has two real zeros given by:

$x = \frac{-4 \pm \sqrt{4^2 - 4(-1)(-1)}}{2(-1)} = \frac{-4 \pm \sqrt{12}}{-2} = \frac{4 \mp \sqrt{12}}{2} = 2 \mp \sqrt{3}.$

Hence, $-x^2 + 4x - 1 > 0$ ⇐⇒ “$x \in (2 - \sqrt{3},\ 2 + \sqrt{3})$”.

Thus, $\frac{-x^2 + 4x - 1}{2x^2 + x + 2} > 0$ ⇐⇒ $x \in (2 - \sqrt{3},\ 2 + \sqrt{3}) \approx (0.268, 3.732)$. As usual, let’s check using our TI84:


Example 167. $\frac{x^2 + 5x + 4}{-x^2 - 2x + 1} > 0$ ⇐⇒ one of the following is true:

1. “$x^2 + 5x + 4 > 0$ AND $-x^2 - 2x + 1 > 0$”; OR

2. “$x^2 + 5x + 4 < 0$ AND $-x^2 - 2x + 1 < 0$”.

The equation $y = x^2 + 5x + 4$ has a ∪-shaped graph and has two real zeros given by:

$x = \frac{-5 \pm \sqrt{5^2 - 4(1)(4)}}{2(1)} = \frac{-5 \pm \sqrt{9}}{2} = \frac{-5 \pm 3}{2} = -4, -1.$

Hence, $x^2 + 5x + 4 > 0$ ⇐⇒ “x < −4 OR x > −1”.

The equation $y = -x^2 - 2x + 1$ has a ∩-shaped graph and has two real zeros given by:

$x = \frac{2 \pm \sqrt{(-2)^2 - 4(-1)(1)}}{2(-1)} = \frac{2 \pm \sqrt{8}}{-2} = -1 \mp \sqrt{2}.$

Hence, $-x^2 - 2x + 1 > 0$ ⇐⇒ “$x \in (-1 - \sqrt{2},\ \sqrt{2} - 1)$”. Thus:

1. “$x^2 + 5x + 4 > 0$ AND $-x^2 - 2x + 1 > 0$” ⇐⇒ “(x < −4 OR x > −1) AND $x \in (-1 - \sqrt{2},\ \sqrt{2} - 1)$”. Since $-4 < -1 - \sqrt{2} < -1 < \sqrt{2} - 1$, this is equivalent to $x \in (-1,\ \sqrt{2} - 1)$.

2. “$x^2 + 5x + 4 < 0$ AND $-x^2 - 2x + 1 < 0$” ⇐⇒ “x ∈ (−4, −1) AND ($x < -1 - \sqrt{2}$ OR $x > \sqrt{2} - 1$)”. Since $-4 < -1 - \sqrt{2} < -1 < \sqrt{2} - 1$, this is equivalent to $x \in (-4,\ -1 - \sqrt{2})$.

Altogether then, $\frac{x^2 + 5x + 4}{-x^2 - 2x + 1} > 0$ ⇐⇒ $x \in (-4,\ -1 - \sqrt{2}) \cup (-1,\ \sqrt{2} - 1)$. As usual, let’s check using our TI84:


Example 168. $\frac{x^2 - 4x + 3}{x^2 - 2x} > 0$ ⇐⇒ one of the following is true:

1. “$x^2 - 4x + 3 > 0$ AND $x^2 - 2x > 0$”; OR

2. “$x^2 - 4x + 3 < 0$ AND $x^2 - 2x < 0$”.

The equation $y = x^2 - 4x + 3$ has a ∪-shaped graph and has two real zeros given by:

$x = \frac{4 \pm \sqrt{(-4)^2 - 4(1)(3)}}{2(1)} = \frac{4 \pm \sqrt{4}}{2} = 1, 3.$

Hence, $x^2 - 4x + 3 > 0$ ⇐⇒ “x < 1 OR x > 3” (and $x^2 - 4x + 3 < 0$ ⇐⇒ “x ∈ (1, 3)”).

The equation $y = x^2 - 2x$ has a ∪-shaped graph and has two real zeros given by:

$x = \frac{2 \pm \sqrt{(-2)^2 - 4(1)(0)}}{2(1)} = \frac{2 \pm \sqrt{4}}{2} = 0, 2.$

Hence, $x^2 - 2x > 0$ ⇐⇒ “x < 0 OR x > 2” (and $x^2 - 2x < 0$ ⇐⇒ “x ∈ (0, 2)”). Thus:

1. “$x^2 - 4x + 3 > 0$ AND $x^2 - 2x > 0$” ⇐⇒ “(x < 1 OR x > 3) AND (x < 0 OR x > 2)” ⇐⇒ “x < 0 OR x > 3”.

2. “$x^2 - 4x + 3 < 0$ AND $x^2 - 2x < 0$” ⇐⇒ “x ∈ (1, 3) AND x ∈ (0, 2)” ⇐⇒ “x ∈ (1, 2)”.

Altogether then, $\frac{x^2 - 4x + 3}{x^2 - 2x} > 0$ ⇐⇒ “x ∈ (−∞, 0) ∪ (1, 2) ∪ (3, ∞)”. As usual, let’s check using our TI84:
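If no graphing calculator is to hand, a similar quick check can be done with a test-point method in Python. This is a minimal sketch rather than the method used in the text: it finds the real zeros of the numerator and denominator with the quadratic formula, then tests the sign of the fraction at one point inside each resulting interval. The names real_roots and positive_intervals are hypothetical, and both quadratics are assumed to have a non-zero leading coefficient.

```python
import math

def real_roots(a, b, c):
    """Real zeros of a*x^2 + b*x + c (a != 0), in increasing order."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    s = math.sqrt(disc)
    return sorted({(-b - s) / (2 * a), (-b + s) / (2 * a)})

def positive_intervals(num, den):
    """Open intervals on which num(x)/den(x) > 0; num and den are (a, b, c) triples."""
    def value(coeffs, x):
        a, b, c = coeffs
        return a * x * x + b * x + c

    cuts = sorted(set(real_roots(*num) + real_roots(*den)))
    edges = [-math.inf] + cuts + [math.inf]
    result = []
    for lo, hi in zip(edges, edges[1:]):
        # Choose any test point strictly inside (lo, hi).
        if lo == -math.inf and hi == math.inf:
            x = 0.0
        elif lo == -math.inf:
            x = hi - 1.0
        elif hi == math.inf:
            x = lo + 1.0
        else:
            x = (lo + hi) / 2
        if value(num, x) / value(den, x) > 0:
            result.append((lo, hi))
    return result

print(positive_intervals((1, -4, 3), (1, -2, 0)))   # Example 168
print(positive_intervals((-1, 4, -1), (2, 1, 2)))   # Example 166
```

The sketch does not merge neighbouring intervals, so a repeated zero of the numerator would show up as two adjacent intervals rather than one.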

Exercise 84. Without using a calculator, find the values of x for which each of the following inequalities is true. (a) $\frac{x^2 + 2x + 1}{x^2 - 3x + 2} > 0$. (b) $\frac{x^2 - 1}{x^2 - 4} > 0$. (c) $\frac{x^2 - 3x - 18}{-x^2 + 9x - 14} > 0$. (d) $\frac{2x + 5}{-x + 4} > \frac{-3x + 1}{6x - 7}$. (Answers on pp. 982, 983, 984, and 985.)

18.3 Solving Inequalities by Graphical Methods

Example 169. For what values of x is x > sin (0.5πx)?

Rewrite the inequality as x − sin(0.5πx) > 0. Graph y = x − sin(0.5πx) on your graphing calculator. Our goal is to first find the horizontal intercepts of this equation; this will let us solve for x > sin (0.5πx).

[Figure: TI84 screenshots after each of Steps 1 to 6.]

In the TI84:

1. Press ON to turn on your calculator.

2. Press Y= to bring up the Y= editor.

3. Press X,T,θ,n − SIN 0 . 5 . To enter “π”, press the blue 2ND button and then π (which corresponds to the ∧ button). Now press X,T,θ,n ) and altogether you will have entered “x − sin(0.5πx)”.

4. Now press GRAPH and the calculator will graph y = x − sin(0.5πx).

It looks like the horizontal intercepts are close to the origin. Let’s zoom in to see better.

5. Press the ZOOM button to bring up a menu of ZOOM options.

6. Press 2 to select the Zoom In option. Nothing seems to happen. But now press ENTER and the TI will zoom in a little for you.

It looks like there are 3 horizontal intercepts. To find out what precisely they are, we’ll use the TI84’s “zero” option.

[Figure: TI84 screenshots after each of Steps 7 to 13.]

7. Press the blue 2ND button and then CALC (which corresponds to the TRACE button). This brings up the CALCULATE menu.

8. Press 2 to select the “zero” option. This brings you back to the graph, with a cursor flashing. Also, the TI84 prompts you with the question: “Left Bound?”

The TI84’s “zero” function works by you first specifying a “Left Bound” and a “Right Bound” for x. The TI84 will then check to see if there are any horizontal intercepts (i.e. values of x for which y = 0) within those bounds.

9. Using the < and > arrow keys, move the blinking cursor until it is where you want your first “Left Bound” to be. For me, I have placed it a little to the left of where I believe the leftmost horizontal intercept to be.

10. Press ENTER and you will have just entered your first “Left Bound”. The TI84 now prompts you with the question: “Right Bound?”.

11. So now just repeat. Using the < and > arrow keys, move the blinking cursor until it is where you want your first “Right Bound” to be. For me, I have placed it a little to the right of where I believe the leftmost horizontal intercept to be.

12. Again press ENTER and you will have just entered your first “Right Bound”. The TI84 now asks you: “Guess?” This is just asking if you want to proceed and get the TI84 to work out where the horizontal intercept is. So go ahead and:

13. Press ENTER. The TI84 now informs you that there is a “Zero” at “x = −1”, “y = 0” and places the blinking cursor at precisely that point. This is the first horizontal intercept we’ve found.

To find each of the other 2 horizontal intercepts, just repeat steps 7 through 13. You should be able to find that they are at x = 0 and x = 1.

Altogether, the 3 intercepts are x = −1, 0, 1. Based on these and what the graph looks like, we conclude: x > sin (0.5πx) ⇐⇒ x ∈ (−1, 0) ∪ (1, ∞).
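The idea of supplying a left bound and a right bound can be imitated in Python. The text does not say which algorithm the TI84 uses internally, so the sketch below uses bisection, a standard method that needs the same two inputs, purely as an illustration. The names f and zero and the particular bounds are hypothetical choices.

```python
import math

def f(x):
    """The expression graphed in Example 169."""
    return x - math.sin(0.5 * math.pi * x)

def zero(func, left, right, tol=1e-10):
    """Bisection: requires func(left) and func(right) to have opposite signs."""
    f_left, f_right = func(left), func(right)
    if f_left == 0:
        return left
    if f_right == 0:
        return right
    if f_left * f_right > 0:
        raise ValueError("no sign change between the two bounds")
    while right - left > tol:
        mid = (left + right) / 2
        f_mid = func(mid)
        if f_mid == 0:
            return mid
        if f_left * f_mid < 0:
            right = mid                  # the zero lies in the left half
        else:
            left, f_left = mid, f_mid    # the zero lies in the right half
    return (left + right) / 2

# One (left bound, right bound) pair around each intercept seen on the graph.
for left, right in ((-1.5, -0.5), (-0.5, 0.4), (0.5, 1.5)):
    print(round(zero(f, left, right), 6))
```

As with the calculator, the bounds must straddle exactly the intercept you are after; if the expression has the same sign at both bounds, this sketch raises an error instead of guessing.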


Example 170. For what values of x is x > e + ln x?

For this example, I won’t give the full detailed instructions of what to do on the TI84; I’ll only show a few screenshots. First, rewrite the inequality as x − e − ln x > 0 and so graph y = x − e − ln x on your graphing calculator.

[Figure: TI84 screenshots after graphing, and after zooming in and adjusting the window.]

Look for the values of x for which x − e − ln x = 0. They are x ≈ 0.0708, 4.1387.

[Figure: TI84 screenshots of the leftmost and rightmost horizontal intercepts.]

Based on these horizontal intercepts and what the graph looks like, we conclude: x > e + ln x if and only if x ∈ (0, 0.0708) ∪ (4.1387, ∞).

Exercise 85. Use a graphing calculator to find the values of x for which each of the following inequalities is true. (a) $x^3 - x^2 + x - 1 > e^x$. (b) $\sqrt{x} > \cos x$. (c) $\frac{1}{1 - x^2} > x^3 + \sin x$. (Answers on p. 986.)


18.4 Systems of Equations

Warm-up questions: Exercise 86. (PSLE-style question.) When Apu was 40 years old, Beng was twice as old as Caleb. Today, Caleb is 28 years old and Apu is twice as old as Beng. What are the ages of Apu and Beng today? (If necessary, assume that the age of a person is always an integer and is fixed between January 1st and December 31st of each year.) (Answer on p. 987.) Exercise 87. (O-Level style question.) Planes A and B leave the same point at 12pm. Plane A travels northeast at a constant speed of 100 km/h. Plane B travels south at a constant speed of 200 km/h. At 3pm, both planes make an instant turn and start flying directly towards each other at the same speed. At what time will the two planes collide? (Answer on p. 987.)

Definition 48. Given an equation involving a single variable x, a real solution to the equation is any value of x ∈ R such that the equation is true.

Example 171. The equation x + 5 = 8 has one real solution: 3. The equation $x^2 - 1 = 0$ has two real solutions: −1 and 1. The equation $x^2 - 1 = 8$ has two real solutions: −3 and 3. The equation $x^3 - 4x = 0$ has three real solutions: −2, 0, and 2.

Example 172. The equation $x^2 + 1 = 0$ has no real solution.

Definition 49. Given an equation involving a single variable x, a real solution set is the set of values of x ∈ R such that the equation is true.

Example 173. The real solution set of the equation x + 5 = 8 is {3}. The real solution set of the equation $x^2 - 1 = 0$ is {−1, 1}. The real solution set of the equation $x^2 - 1 = 8$ is {−3, 3}. The real solution set of the equation $x^3 - 4x = 0$ is {−2, 0, 2}.

Example 174. The real solution set of the equation $x^2 + 1 = 0$ is ∅ = {}.


Definition 50. Given a system of equations (or more simply a set of equations) involving two variables x and y, a real solution to the set of equations is any point (or ordered pair) (x, y) with x, y ∈ R for which the system of equations is true; and a real solution set is the set of ordered pairs (x, y) for which the system of equations is true.

Example 175. Consider the system of equations y = x + 1, y = −x + 3. To solve this system of equations, plug the second equation into the first to get: −x + 3 = x + 1. Now solve: x = 1. And so y = x + 1 = 2. Altogether, this system of equations has one real solution (1, 2). Its real solution set is thus {(1, 2)}.

Example 176. Consider the system of equations $y = 0.5x^2 - 1.5$ and y = x. To solve this system of equations, plug the second equation into the first to get: $x = 0.5x^2 - 1.5$. Rearranging: $x^2 - 2x - 3 = 0$. Now solve: x = 3, −1. Correspondingly, y = 3, −1. Altogether, this system of equations has two real solutions: (3, 3) and (−1, −1). Its real solution set is thus {(3, 3), (−1, −1)}.

A system of equations can have no real solutions.

Example 177. Consider the system of equations y = ln x and y = x. Observe that for all x ∈ (0, 1), ln x < 0 and hence x > ln x. Moreover, for x = 1, ln x = 0 < x. Also, for x > 1, $\frac{d}{dx}\ln x = \frac{1}{x} < 1 = \frac{d}{dx}x$, so the slope of y = x is steeper than that of y = ln x. Altogether then, for all x > 0, x > ln x. Hence, this system of equations has no real solutions. Its real solution set is thus ∅ = {}.

A system of equations can also have infinitely many real solutions.

Example 178. Consider the system of equations y = x and 2y = 2x. Observe that this system of equations has infinitely many real solutions, e.g. (1, 1), (2, 2), (2.74, 2.74). There is thus no way to explicitly list out all its real solutions. However, using set-builder notation, we can write down its real solution set as {(x, y) ∶ y = x}. This says that every ordered pair (x, y) such that y = x is a real solution to the given system of equations.

Solve these problems without using a calculator.

Exercise 88. The points (1, 2), (3, 5), and (6, 9) satisfy the equation $y = ax^2 + bx + c$. What are a, b, and c? (Answer on p. 988.)

Exercise 89. The point (−1, 2) satisfies the equation $y = ax^2 + bx + c$. Moreover, the minimum point of the equation $y = ax^2 + bx + c$ is (0, 0). What are a, b, and c? (Answer on p. 988.)


You are required to know how to use a graphing calculator to find the numerical solution of equations (including systems of linear equations).

Example 179. Solve the system of equations $y = x^4 - x^3 - 5$, $y = \ln x$.

One method is to graph both equations on your graphing calculator and then find their intersection points.

Here I’ll use another method: First rewrite the two equations as a third equation $y = x^4 - x^3 - 5 - \ln x$. Our goal is to find the horizontal intercepts of this equation; their x-coordinates are also the x-coordinates of the solutions to the above set of equations. Briefly, in the TI84:

1. Graph the equation $y = x^4 - x^3 - 5 - \ln x$.

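It looks like there is only one horizontal intercept.

2. Zoom in.

3. Find the horizontal intercept using the “zero” option.

Conclusion: The intercept found has x-coordinate 1.8658. To find the y-coordinate, we need merely plug in this value of x into either of the equations in the original set of equations: y = ln x = ln 1.8658 ≈ 0.6237. Altogether, this gives the solution (1.8658, 0.6237).

Strictly speaking, this is not the only solution. Because ln x → −∞ as x → 0⁺, the expression $x^4 - x^3 - 5 - \ln x$ is positive for x very close to 0 but negative at, say, x = 0.01. So the graph has a second horizontal intercept at x ≈ 0.0067 (where y ≈ −5). This intercept hugs the vertical axis and is easy to miss on the calculator screen.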
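Counting intercepts by eye can be backed up with a sign-change scan. The following minimal Python sketch (an illustration only; the names h and sign_change_brackets and the grid from 0.001 to 3 are assumptions) evaluates $x^4 - x^3 - 5 - \ln x$ on an evenly spaced grid and reports each pair of neighbouring grid points between which the sign flips. On this grid it reports two brackets, one near x ≈ 0.0067 and one around x ≈ 1.8658.

```python
import math

def h(x):
    """Difference of the two right-hand sides in Example 179."""
    return x ** 4 - x ** 3 - 5 - math.log(x)

def sign_change_brackets(func, start, stop, steps):
    """Pairs of adjacent grid points between which func changes sign."""
    brackets = []
    width = (stop - start) / steps
    left, f_left = start, func(start)
    for k in range(1, steps + 1):
        right = start + k * width
        f_right = func(right)
        if f_left * f_right < 0:
            brackets.append((left, right))
        left, f_left = right, f_right
    return brackets

for left, right in sign_change_brackets(h, 0.001, 3.0, 2999):
    print(round(left, 3), round(right, 3))
```

A scan like this can only detect zeros where the sign actually flips between grid points, so it is a check on the graph rather than a proof.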

[Figure: TI84 screenshots after each of Steps 1 to 3.]

Exercise 90. Using your graphing calculator, solve the following systems of equations. (Answers on pp. 989, 990, and 991.) (a) $x^2 + y^2 = 1$, $y = \sin x$. (b) $y = \frac{1}{1 + \sqrt{x}}$, $y = x^5 - x^3 + 2$. (c) $y = \frac{1}{1 - x^2}$, $y = x^3 + \sin x$.

Part II Sequences and Series

19 Finite Sequences

Recall that an ordered pair (of real numbers) was simply any pair of real numbers, enclosed by parentheses, and whose order matters (and this was the only difference between an ordered pair and a set of two objects). Example 180. (1, 2) and (2, 1) are both ordered pairs with (1, 2) ≠ (2, 1).

We can analogously define ordered triples, quadruples, quintuples, etc.

Example 181. (1, 2, 3) and (2, 1, 3) are both ordered triples with (1, 2, 3) ≠ (2, 1, 3). (1, 1, 1, 1) and (2, 4, 1, 3) are both ordered quadruples with (1, 1, 1, 1) ≠ (2, 4, 1, 3).

(2, 2, 3, 2, 2) and (2, 4, 1, 5, 3) are both ordered quintuples with (2, 2, 3, 2, 2) ≠ (2, 4, 1, 5, 3).

We’ll simply call all of these ordered n-tuples or even simply tuples.

Example 182. (1, 2, 3), (2, 1, 3), (1, 1, 1, 1), (2, 4, 1, 3), (2, 2, 3, 2, 2), and (2, 4, 1, 5, 3) are all ordered n-tuples. (1, 2, 3) and (2, 1, 3) are ordered 3-tuples or triples. (1, 1, 1, 1) and (2, 4, 1, 3) are ordered 4-tuples or quadruples. (2, 2, 3, 2, 2) and (2, 4, 1, 5, 3) are ordered 5-tuples or quintuples.

In fact, when talking about tuples, it will be understood that they are ordered, so we’ll drop the word “ordered” and simply call them tuples (instead of ordered tuples).

Definition 51. A finite sequence of length n is any n-tuple.

Example 183. (1, 2, 3) and (2, 1, 3) are 3-tuples or, equivalently, finite sequences of length 3. (1, 2, 3, 4) and (2, 4, 1, 3) are 4-tuples or, equivalently, finite sequences of length 4. (1, 2, 3, 4, 5) and (2, 4, 1, 5, 3) are 5-tuples or, equivalently, finite sequences of length 5.

We refer to the objects in a sequence as terms.

Example 184. Given the sequence (2, 1, 3), 2 is its first term, 1 is its second term, and 3 is its third term.

19.1 A Corresponding Function for a Sequence

Another perspective is to think of a finite sequence of length n as a function whose domain is {1, 2, 3, . . . , n} and whose codomain is R (see footnote 32).

Example 185. (2, 4, 6, 8, 10, 12, 14) is a finite sequence of length 7, consisting of the first seven even positive integers. A corresponding function f for this sequence has

• Domain {1, 2, 3, 4, 5, 6, 7};
• Codomain R; and
• Mapping rule f(n) = 2n, for all n.

Indeed, the values of the function f (1) = 2, f (2) = 4, f (3) = 6, ..., f (7) = 14 exactly list out the terms in the finite sequence (2, 4, 6, 8, 10, 12, 14).
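The link between a mapping rule and the sequence it generates can be made concrete in Python. This is a minimal sketch; the helper name sequence_from is hypothetical, and a Python tuple is used to stand in for a finite sequence.

```python
def sequence_from(rule, length):
    """List rule(1), rule(2), ..., rule(length) as a tuple."""
    return tuple(rule(n) for n in range(1, length + 1))

def f(n):
    """Mapping rule of Example 185."""
    return 2 * n

print(sequence_from(f, 7))  # (2, 4, 6, 8, 10, 12, 14)
```

Here range(1, length + 1) plays the role of the domain {1, 2, . . . , n}.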

Example 186. (2, 5, 12, 23, 38, 57, 80, 107, 138, 173) is a finite sequence of length 10. A corresponding function f for this sequence has

• Domain {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
• Codomain R; and
• Mapping rule: $f(n) = 2n^2 - 3n + 3$, for all n.

Indeed, the values of the function f(1) = 2, f(2) = 5, f(3) = 12, f(4) = 23, ..., f(10) = 173 exactly list out the terms in the finite sequence (2, 5, 12, 23, 38, 57, 80, 107, 138, 173).

Exercise 91. (Answer on p. 992.) For each of the following finite sequences, write down a corresponding function.

(a) (1, 4, 9, 16, 25, 36, 49, 64, 81, 100).
(b) (2, 5, 8, 11, 14, 17, 20).
(c) (0.5, 4, 13.5, 32, 62.5, 108, 171.5).
(d) (2, 6, 6, 12, 10, 18, 14, 24, 18, 30, 22, 36, 26, 42).
(e) (18, 14.5).

Footnote 32. Indeed, this is how a sequence is usually formally defined.


19.2 Recurrence Relations

SYLLABUS ALERT: Recurrence relations are included in the 9740 (old) syllabus, but not in the 9758 (revised) syllabus. So you can skip this section if you’re taking 9758.

Example 187. (1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024) is a finite sequence of length 11. A corresponding function f for this sequence has

• Domain {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};
• Codomain R; and
• Mapping rule: f(1) = 1 and f(n) = 2f(n − 1) (the recurrence relation), for all n ≥ 2.

The equation f(n) = 2f(n − 1) is an example of a recurrence relation. That is, it describes how each term in the sequence is generated, depending on what previous terms were.

In this particular example of a sequence, we can easily write down another corresponding function that does not involve a recurrence relation:

Example 188. (1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024) is a finite sequence of length 11. A corresponding function g for this sequence has

• Domain {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};
• Codomain R; and
• Mapping rule: $g(n) = 2^{n-1}$ (not a recurrence relation), for all n.

If we can describe a sequence without using a recurrence relation, then we can immediately compute what each term in the sequence is. So in the case of the finite sequence just given, we prefer to use the function g rather than the function f as a corresponding function.

In contrast, with a recurrence relation, we need to know what some of the previous terms are, in order to compute each term. So if possible, we prefer to describe sequences without using recurrence relations. But sometimes, it is difficult to describe a sequence without using a recurrence relation.
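The difference between the two descriptions shows up clearly in code. In the minimal Python sketch below (the function names are hypothetical), the recurrence has to build every earlier term before it reaches a given one, whereas the closed form computes any term directly. Both produce the eleven terms listed in Examples 187 and 188.

```python
def by_recurrence(length):
    """f(1) = 1 and f(n) = 2 f(n - 1): each term is built from the previous one."""
    terms = [1]
    for _ in range(length - 1):
        terms.append(2 * terms[-1])
    return tuple(terms)

def by_closed_form(length):
    """g(n) = 2^(n - 1): each term is computed directly from n."""
    return tuple(2 ** (n - 1) for n in range(1, length + 1))

print(by_recurrence(11))
print(by_closed_form(11) == by_recurrence(11))  # True
```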


Example 189. (1, 4, 10, 22, 46, 94, 190, 382, 766, 1534) is a finite sequence of length 10. A corresponding function f for this sequence has

• Domain {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
• Codomain R; and
• Mapping rule: f(1) = 1 and f(n) = 2f(n − 1) + 2 (the recurrence relation), for all n ≥ 2.

It is possible to describe the sequence just given without using a recurrence relation, but it does not come obviously (at least to the untrained eye) and takes a little work, as we’ll see.

A recurrence relation can certainly involve more than just the previous term. In the Fibonacci sequence, each term (from the third term onwards) is the sum of the previous two terms: f(n) = f(n − 2) + f(n − 1). This equation is again a recurrence relation.

But in the past ten years’ exams, I haven’t seen a question where the recurrence relation involves more than just the previous term. So we shall not bother doing many of these.
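For the curious, a recurrence that looks back two terms, such as the Fibonacci rule just described, can be sketched in Python as follows. The function name fibonacci is hypothetical, and the starting values f(1) = f(2) = 1 are the usual convention.

```python
def fibonacci(length):
    """First `length` terms, with f(1) = f(2) = 1 and f(n) = f(n - 2) + f(n - 1)."""
    terms = []
    for n in range(1, length + 1):
        if n <= 2:
            terms.append(1)
        else:
            terms.append(terms[-2] + terms[-1])
    return tuple(terms)

print(fibonacci(11))  # (1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89)
```

Note that two starting values are needed here, because the rule cannot be applied until two earlier terms exist.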

Example 190. (1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89) is a finite sequence of length 11, consisting of the first 11 Fibonacci numbers. A corresponding function f for this sequence has

• Domain {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};
• Codomain R; and
• Mapping rule: f(n) = 1, for n = 1, 2; and f(n) = f(n − 2) + f(n − 1) (the recurrence relation), for all n ≥ 3.

Exercise 92. Each of the following finite sequences involves a recurrence relation. (Hint: Each involves only the previous term and also a squared term.) Write down a corresponding function for each. (a) (3, 4, 9, 64, 3969). (b) (1, 2, 10, 290, 252010). (Answer on p. 993.)


19.3 Creating New Sequences

Notation. Consider the finite sequence of length k, $(a_1, a_2, a_3, \dots, a_k)$. A shorthand piece of notation for this sequence is $(a_n)_{n \le k}$. We call n the index variable or dummy variable. We’ll assume the index variable always starts from 1, unless otherwise specified.

Example 191. (1, 1, 1, 3, 5, 9, 17, 31, 57, 105, 193) is a finite sequence of length 11. We can also write it as $(a_n)_{n \le 11} = (a_1, a_2, a_3, \dots, a_{11})$, where $a_1 = 1$, $a_2 = 1$, $a_3 = 1$, $a_4 = 3$, $a_5 = 5$, ..., $a_{11} = 193$.

Example 192. (1, 1, 1, 2, 2, 3, 4, 5, 7, 9, 12, 16, 21, 28, 37, 49, 65, 86, 114, 151) is a finite sequence of length 20. We can also write it as $(b_n)_{n \le 20} = (b_1, b_2, b_3, \dots, b_{20})$, where $b_1 = 1$, $b_2 = 1$, $b_3 = 1$, $b_4 = 2$, $b_5 = 2$, ..., $b_{20} = 151$.

Example 193. (2, 4, 6, 8, 10, 12, 14) is a finite sequence of length 7. We can also write it as $(c_n)_{n \le 7} = (c_1, c_2, c_3, \dots, c_7)$, where $c_1 = 2$, $c_2 = 4$, $c_3 = 6$, ..., $c_7 = 14$.

Example 194. (1, 1, 3, 5, 11, 21, 43, 85, 171, 341, 683) is a finite sequence of length 11. We can also write it as $(d_n)_{n \le 11} = (d_1, d_2, d_3, \dots, d_{11})$, where $d_1 = 1$, $d_2 = 1$, $d_3 = 3$, $d_4 = 5$, $d_5 = 11$, ..., $d_{11} = 683$.

We can create new sequences out of old ones, in the “obvious” fashion:

Example 195. Using the sequence $(a_n)_{n \le 11} = (1, 1, 1, 3, 5, 9, 17, 31, 57, 105, 193)$, here are some new sequences we can create:

$(z_n)_{n \le 11} = (a_n + 1)_{n \le 11} = (a_1 + 1, a_2 + 1, a_3 + 1, \dots, a_{11} + 1) = (2, 2, 2, 4, 6, 10, 18, 32, 58, 106, 194) = (z_1, z_2, z_3, \dots, z_{11})$,

$(y_n)_{n \le 11} = (2a_n)_{n \le 11} = (2a_1, 2a_2, 2a_3, \dots, 2a_{11}) = (2, 2, 2, 6, 10, 18, 34, 62, 114, 210, 386) = (y_1, y_2, y_3, \dots, y_{11})$,

$(x_n)_{n \le 11} = (a_n - 1)_{n \le 11} = (a_1 - 1, a_2 - 1, a_3 - 1, \dots, a_{11} - 1) = (0, 0, 0, 2, 4, 8, 16, 30, 56, 104, 192) = (x_1, x_2, x_3, \dots, x_{11})$,

$(w_n)_{n \le 11} = (a_n/2)_{n \le 11} = (a_1/2, a_2/2, a_3/2, \dots, a_{11}/2) = (1/2, 1/2, 1/2, 3/2, 5/2, 9/2, 17/2, 31/2, 57/2, 105/2, 193/2) = (w_1, w_2, w_3, \dots, w_{11})$.


Moreover, using two (or more) finite sequences that are of the same length, we can likewise create a new finite sequence (also of the same length), in the “obvious” fashion:

Example 196. Using the sequences $(a_n)_{n \le 11} = (1, 1, 1, 3, 5, 9, 17, 31, 57, 105, 193)$ and $(d_n)_{n \le 11} = (1, 1, 3, 5, 11, 21, 43, 85, 171, 341, 683)$, here are some new sequences we can create:

$(e_n)_{n \le 11} = (a_n + d_n)_{n \le 11} = (a_1 + d_1, a_2 + d_2, a_3 + d_3, \dots, a_{11} + d_{11}) = (2, 2, 4, 8, 16, 30, 60, 116, 228, 446, 876) = (e_1, e_2, e_3, \dots, e_{11})$,

$(f_n)_{n \le 11} = (a_n \cdot d_n)_{n \le 11} = (a_1 \cdot d_1, a_2 \cdot d_2, a_3 \cdot d_3, \dots, a_{11} \cdot d_{11}) = (1, 1, 3, 15, 55, 189, \dots, 131819) = (f_1, f_2, f_3, \dots, f_{11})$,

$(g_n)_{n \le 11} = (a_n - d_n)_{n \le 11} = (a_1 - d_1, a_2 - d_2, a_3 - d_3, \dots, a_{11} - d_{11}) = (0, 0, -2, -2, -6, -12, -26, \dots, -490) = (g_1, g_2, g_3, \dots, g_{11})$,

$(h_n)_{n \le 11} = (a_n/d_n)_{n \le 11} = (a_1/d_1, a_2/d_2, a_3/d_3, \dots, a_{11}/d_{11}) = (1, 1, 1/3, 3/5, 5/11, 9/21, \dots, 193/683) = (h_1, h_2, h_3, \dots, h_{11})$.

There are of course many other new sequences we can create, whether using only one sequence, using two sequences, or even using three or more sequences.

Remark 6. You cannot create a new sequence using two finite sequences that are of different lengths. For example, given two finite sequences (an )n≤11 = (1, 1, 1, 3, 5, 9, 17, 31, 57, 105, 193) and (cn )n≤7 = (2, 4, 6, 8, 10, 12, 14), there is no such sequence as (an + cn )n≤11 or even (an + cn )n≤7 . Either of these supposed sequences is simply undefined.

It turns out that we are rarely interested in finite sequences. Instead, we are much more interested in infinite sequences, which are a simple extension of the concept of finite sequences.
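The term-by-term constructions of Examples 195 and 196, and the restriction in Remark 6, can be sketched in Python. This is only an illustration: the helper name `termwise` and the choice to raise an error for unequal lengths are assumptions made here, not part of the textbook.

```python
from fractions import Fraction
from operator import add, mul, sub

a = [1, 1, 1, 3, 5, 9, 17, 31, 57, 105, 193]
d = [1, 1, 3, 5, 11, 21, 43, 85, 171, 341, 683]
c = [2, 4, 6, 8, 10, 12, 14]  # length 7, as in Remark 6

def termwise(op, s, t):
    """Combine two finite sequences of the same length term by term (hypothetical helper)."""
    if len(s) != len(t):
        raise ValueError("sequences of different lengths cannot be combined")
    return [op(x, y) for x, y in zip(s, t)]

# New sequences from one sequence (Example 195).
z = [x + 1 for x in a]
w = [Fraction(x, 2) for x in a]

# New sequences from two sequences of equal length (Example 196).
e = termwise(add, a, d)
f = termwise(mul, a, d)
g = termwise(sub, a, d)
h = termwise(Fraction, a, d)  # exact fractions a_n / d_n

# Remark 6: sequences of lengths 11 and 7 cannot be combined.
try:
    termwise(add, a, c)
    combined = True
except ValueError:
    combined = False

print(e)
print(f[-1], g[-1], h[-1], combined)
```

Using `Fraction` keeps terms such as 193/683 exact instead of rounding them to floating-point numbers.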


20

Infinite Sequences

We can easily extend the concept of finite sequences to infinite sequences, which have domain Z+ = {1, 2, 3, 4, . . . } (the entire set of positive integers).

Example 197. (2, 4, 6, 8, 10, 12, 14, 16, 18, . . . ) is the infinite sequence consisting of all the even positive integers. A corresponding function f for this sequence has

• Domain Z+ ;
• Codomain R; and
• Mapping rule f (n) = 2n for all n.

Example 198. (1, 3, 6, 10, 15, 21, 28, 36, 45, 55, . . . ) is the infinite sequence consisting of the triangular numbers. A corresponding function f for this sequence has

• Domain Z+ ;
• Codomain R; and
• Mapping rule f (1) = 1 and f (n) = 1 + 2 + ⋅ ⋅ ⋅ + n for all n ≥ 2.

Example 199. The infinite sequence (1, 2, 6, 24, 120, 720, 5040, ...) has the corresponding function f with

• Domain Z+ ;
• Codomain R; and
• Mapping rule f (n) = 1 × 2 × ⋅ ⋅ ⋅ × n = n! for all n.
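Since an infinite sequence is just a function on the positive integers, the mapping rules of Examples 197 to 199 can be written as ordinary Python functions, and any finite number of leading terms can then be listed. The helper `first_terms` is a name invented for this sketch.

```python
from itertools import count, islice
from math import factorial

def first_terms(f, how_many):
    """List f(1), f(2), ..., f(how_many) for a function f defined on 1, 2, 3, ..."""
    return [f(n) for n in islice(count(1), how_many)]

def even(n):
    return 2 * n                      # Example 197: f(n) = 2n

def triangular(n):
    return sum(range(1, n + 1))       # Example 198: f(n) = 1 + 2 + ... + n

evens = first_terms(even, 9)
triangulars = first_terms(triangular, 10)
factorials = first_terms(factorial, 7)  # Example 199: f(n) = n!

print(evens)
print(triangulars)
print(factorials)
```

Only finitely many terms can ever be printed, of course; the function itself is what stands for the whole infinite sequence.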

Exercise 93. For each of the following infinite sequences, write down a corresponding function. (a) (1, 4, 9, 16, 25, 36, 49, 64, 81, 100, . . . ). (b) (2, 5, 8, 11, 14, 17, 20, . . . ). (c) (0.5, 4, 13.5, 32, 62.5, 108, 171.5, . . . ). (d) (2, 6, 6, 12, 10, 18, 14, 24, 18, 30, 22, 36, 26, 42, . . . ). (Answer on p. 994.)


20.1

Creating New Sequences

(an ) is our shorthand notation for an infinite sequence, where (an ) = (a1 , a2 , a3 , . . . ).

As stated, we are rarely interested in finite sequences. And so whenever we talk about a sequence, it should be assumed that we are talking about an infinite sequence, unless otherwise clearly stated.

The idea of creating new sequences carries over from the finite case in the “obvious” fashion.

Example 200. Let (an ) = (1, 1, 2, 3, 5, 8, 13, 21, 34, 55, . . . ) and (bn ) = (2, 4, 6, 8, 10, 12, 14, 16, 18, 20, . . . ). Then

(an + bn ) = (3, 5, 8, 11, 15, 20, 27, 37, 52, 75, . . . ) .

Analogous to Remark 6, you cannot create a new sequence using a finite sequence and an infinite sequence. Instead, you can only create one using two infinite sequences.

Example 201. Let (an ) = (1, 1, 2, 3, 5, 8, 13, 21, 34, 55, . . . ) and (bn )n≤7 = (2, 4, 6, 8, 10, 12, 14). Then

(an + bn ) is undefined.

21

Series

Definition 52. Given a finite sequence (an )n≤k , its series is the expression a1 + a2 + a3 + ⋅ ⋅ ⋅ + ak .

We refer to a1 as the first term of the sequence and also as the first term of the series. Similarly, a2 is the second term of both the sequence and the series. Etc.

Definition 53. Given a finite sequence (an )n≤k , its sum of series is the number S such that S = a1 + a2 + a3 + ⋅ ⋅ ⋅ + ak .

Example 202. Given the sequence (an )n≤8 = (1, 1, 1, 3, 5, 9, 17, 31), its series is the expression 1 + 1 + 1 + 3 + 5 + 9 + 17 + 31 and its sum of series is the number 68.

Example 203. Given the sequence (bn )n≤7 = (2, 4, 6, 8, 10, 12, 14), its series is the expression 2 + 4 + 6 + 8 + 10 + 12 + 14 and its sum of series is the number 56.

It may seem strange and unnecessary to distinguish between a series and a sum of series. Aren’t they exactly the same thing? It turns out that expressions like a1 + a2 + a3 + ⋅ ⋅ ⋅ + ak play an important role in maths and so we want to reserve a special name for the expression itself and distinguish it from the sum of series. For example, we might be specifically interested in the series 1 + 2 + 3, rather than just the sum of series 6. Clearly, every finite sequence has a well-defined sum of series – simply add up all the terms in the finite sequence!

Definition 54. Given an infinite sequence (an ), its series is the expression a1 + a2 + a3 + . . . .

A series that corresponds to a finite sequence is called a finite series, while a series that corresponds to an infinite sequence is called an infinite series.


21.1

Convergent and Divergent Sequences and Series

Every finite sequence has a sum of series. In contrast, not all infinite sequences do:

Example 204. Consider the sequence (an ) = (1, 1, 1, 1, 1, 1, . . . ). Its series is the expression 1 + 1 + 1 + 1 + 1 + . . . . There is no number equal to 1 + 1 + 1 + 1 + 1 + . . . and so a sum of series does not exist for this sequence.

But some infinite sequences do have sums of series:

Example 205. Consider the sequence (bn ) = (0, 0, 0, 0, 0, 0, . . . ). Its series is the expression 0 + 0 + 0 + 0 + 0 + . . . . The sum of series for this sequence exists and is 0.

Definition 55. An infinite sequence for which a sum of series exists is said to have a convergent series. An infinite sequence for which no sum of series exists is said to have a divergent series.

So in the above examples, we say that the sequence (an ) has a divergent series (because its sum of series does not exist), while the sequence (bn ) has a convergent series (because its sum of series exists).


But what exactly is a convergent series? When exactly is a series convergent? These are actually fascinating questions, which means, of course, that they’re not in the syllabus. Here is a simple example that gives you a glimpse of the difficulties involved. Chapter 81 in the Appendices (optional) gives the precise definitions of when a series converges or diverges.

Example 206. Consider the sequence (cn ) = (1, −1, 1, −1, 1, −1, . . . ), where the terms simply alternate between 1 and −1. Its series is the expression 1 − 1 + 1 − 1 + 1 − 1 + . . . . Is there any number that is equal to 1 − 1 + 1 − 1 + 1 − 1 + . . . ? It’s actually not obvious. On the one hand, we can pair together every two terms like so:

$$1 - 1 + 1 - 1 + 1 - 1 + \dots = \underbrace{(1 - 1)}_{0} + \underbrace{(1 - 1)}_{0} + \underbrace{(1 - 1)}_{0} + \dots = 0 + 0 + 0 + \dots$$

and happily conclude that the sum of series is 0. But wait a minute ... what if we instead pair together every two terms like so:

$$1 - 1 + 1 - 1 + 1 - 1 + 1 - \dots = 1 + \underbrace{(-1 + 1)}_{0} + \underbrace{(-1 + 1)}_{0} + \underbrace{(-1 + 1)}_{0} + \dots = 1 + 0 + 0 + 0 + \dots$$

Then we’d have to conclude that the sum of series is 1!

It turns out that the sequence (cn ) = (1, −1, 1, −1, 1, −1, . . . ) has a divergent series. Or equivalently, a sum of series simply does not exist for this sequence.
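One standard way to get a feel for these three examples is to look at partial sums, that is, the sums of the first 1, 2, 3, ... terms. The sketch below is only an illustration; the precise definition is the one deferred to the Appendices, and the helper name `partial_sums` is an assumption of this example.

```python
from itertools import accumulate

def partial_sums(terms):
    """Return the running totals a1, a1 + a2, a1 + a2 + a3, ..."""
    return list(accumulate(terms))

ones = [1] * 8                                # first 8 terms of (1, 1, 1, ...)
zeros = [0] * 8                               # first 8 terms of (0, 0, 0, ...)
alternating = [(-1) ** n for n in range(8)]   # 1, -1, 1, -1, ...

sums_ones = partial_sums(ones)                # keeps growing
sums_zeros = partial_sums(zeros)              # stays at 0
sums_alternating = partial_sums(alternating)  # flips between 1 and 0

print(sums_ones)
print(sums_zeros)
print(sums_alternating)
```

The partial sums of the alternating sequence never settle on a single number, which matches the two conflicting "answers" 0 and 1 obtained by pairing terms in different ways.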


22

Summation Notation Σ

Σ is the upper-case Greek letter sigma. An enlarged version of that letter ∑, read aloud as “sum”, is used to express series in compact notation:

Example 207. Consider the series 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9. Another way to write it is to use summation notation:

$$1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 = \sum_{n=1}^{9} n.$$

Let’s examine the expression on the RHS.

The variable n below the ∑ is called the index variable or dummy variable. We could have named it p or z or x or any other letter (instead of n) and it wouldn’t have mattered. Hence the name “dummy”.

The “= 1” below the ∑ says that we start counting the index variable from n = 1. We call the number “1” the starting point.

The “9” above the ∑ is called the stopping point. It says that we should stop adding once we hit n = 9.

Altogether, the notation $\sum_{n=1}^{9}$ says that we are adding up 9 terms, namely a1 , a2 , ..., a9 .

The expression to the right of the ∑ tells us what each an is. In this example, it is n, which simply says that for every n, an = n.

Altogether, $\sum_{n=1}^{9} n$ says that we add up a1 through a9 , where each an is simply equal to n.

Example 208. The series 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 can be written as

$$\sum_{n=1}^{9} 1.$$

This says that the starting point is 1 and the stopping point is 9. In other words, we add up a1 , a2 , . . . , a9 , where for each n, an = 1. And so a1 = 1, a2 = 1, etc. Altogether:

$$\sum_{n=1}^{9} 1 = a_1 + a_2 + \dots + a_9 = 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1.$$

Example 209. The series 2 + 4 + 6 + 8 + 10 + 12 + 14 can be written as

$$\sum_{n=1}^{7} 2n.$$

This says that the starting point is 1 and the stopping point is 7. In other words, we add up a1 , a2 , . . . , a7 , where for each n, an = 2n. And so a1 = 2, a2 = 4, etc. Altogether:

$$\sum_{n=1}^{7} 2n = a_1 + a_2 + \dots + a_7 = 2 \times 1 + 2 \times 2 + \dots + 2 \times 7 = 2 + 4 + 6 + 8 + 10 + 12 + 14.$$

The series 3 + 5 + 7 + 9 + 11 + 13 + 15 can be rewritten as $\sum_{n=1}^{7} (2n + 1)$. The parentheses help to clarify that we are not talking about $1 + \sum_{n=1}^{7} 2n$.

This says that the starting point is 1 and the stopping point is 7. In other words, we add up a1 , a2 , . . . , a7 , where for each n, an = 2n + 1. And so a1 = 3, a2 = 5, etc. Altogether:

$$\sum_{n=1}^{7} (2n + 1) = a_1 + a_2 + \dots + a_7 = \overbrace{(2 \times 1 + 1)}^{3} + \overbrace{(2 \times 2 + 1)}^{5} + \dots + \overbrace{(2 \times 7 + 1)}^{15}.$$

Example 210. The series 2 + 4 + 8 + 16 + 32 + 64 + 128 + 256 + 512 + 1024 can be written as

$$\sum_{n=1}^{10} 2^n.$$

This says that the starting point is 1 and the stopping point is 10. In other words, we add up a1 , a2 , . . . , a10 , where for each n, $a_n = 2^n$. And so a1 = 2, a2 = 4, etc. Altogether:

$$\sum_{n=1}^{10} 2^n = a_1 + a_2 + \dots + a_{10} = 2^1 + 2^2 + 2^3 + \dots + 2^{10} = 2 + 4 + 8 + \dots + 1024.$$

It’s nice to have 1 as the starting point, but there’s no reason why this must always be so.

Example 211. The series 1 + 2 + 4 + 8 + 16 + 32 + 64 + 128 + 256 + 512 + 1024 can be written as

$$\sum_{n=0}^{10} 2^n.$$

This says that the starting point is 0 and the stopping point is 10. In other words, we add up a0 , a1 , a2 , . . . , a10 , where for each n, $a_n = 2^n$. And so a0 = 1, a1 = 2, a2 = 4, etc. Altogether:

$$\sum_{n=0}^{10} 2^n = a_0 + a_1 + a_2 + \dots + a_{10} = 2^0 + 2^1 + 2^2 + \dots + 2^{10} = 1 + 2 + 4 + \dots + 1024.$$

Exercise 94. Rewrite each of the following in summation notation. (Answer on p. 995.) (a) 1 + 4 + 9 + 16 + 25 + 36 + 49 + 64 + 81 + 100. (b) 2 + 5 + 8 + 11 + 14 + 17 + 20 + 23. (c) 0.5 + 4 + 13.5 + 32 + 62.5 + 108 + 171.5.

Exercise 95. Find the sum of each of the following series. (Answer on p. 995.)

(a) $\sum_{n=-2}^{5} (2 - n)^n$.

(b) $\sum_{n=16}^{17} (4n + 5)$.

(c) $\sum_{x=31}^{33} (x - 3)$.

Here’s the general definition of the summation notation:

Definition 56. Let s, k be integers with s ≤ k. Let f be a real-valued function whose domain contains s, s + 1, . . . , k. Then

$$\sum_{n=s}^{k} f(n) = f(s) + f(s + 1) + \dots + f(k).$$
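Definition 56 translates almost word for word into code. The following sketch (the function name `summation` is made up for this illustration) evaluates the sums in Examples 207 to 211; note that Python's `range(s, k + 1)` is what makes the stopping point k inclusive.

```python
def summation(f, s, k):
    """Return f(s) + f(s + 1) + ... + f(k) for integers s <= k."""
    if s > k:
        raise ValueError("the starting point must not exceed the stopping point")
    return sum(f(n) for n in range(s, k + 1))

example_207 = summation(lambda n: n, 1, 9)           # 1 + 2 + ... + 9
example_208 = summation(lambda n: 1, 1, 9)           # nine 1s
example_209 = summation(lambda n: 2 * n, 1, 7)       # 2 + 4 + ... + 14
odd_terms = summation(lambda n: 2 * n + 1, 1, 7)     # 3 + 5 + ... + 15
example_210 = summation(lambda n: 2 ** n, 1, 10)     # 2 + 4 + ... + 1024
example_211 = summation(lambda n: 2 ** n, 0, 10)     # 1 + 2 + ... + 1024

print(example_207, example_208, example_209, odd_terms, example_210, example_211)
```

Renaming the lambda's argument from `n` to any other letter changes nothing, which is exactly the sense in which the index variable is a "dummy".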


23

Arithmetic Sequences and Series

Example 212. Consider the finite sequence (4, 7, 10, 13, 16, 19, 22). A corresponding function f for this sequence has

• Domain {1, 2, 3, 4, 5, 6, 7} (a subset of Z+ );
• Codomain R; and
• Mapping rule f (1) = 4 and f (n) − f (n − 1) = 3 for all n ≥ 2.

This is an example of a finite arithmetic sequence.

Example 213. Consider the infinite sequence (4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34, . . . ). A corresponding function f for this sequence has

• Domain Z+ ;
• Codomain R; and
• Mapping rule f (1) = 4 and f (n) − f (n − 1) = 3 for all n ≥ 2.

This is an example of an infinite arithmetic sequence.

Definition 57. An arithmetic sequence (or an arithmetic progression) is any finite or infinite sequence (an ) where an+1 − an is a constant for all n = 1, 2, 3, . . . . We call an+1 − an the common difference. We call the series for an arithmetic sequence an arithmetic series. And its sum of series (if it exists at all) is called an arithmetic sum of series.

Example 214. The sequence (an ) = (1, 4, 7, 10, 13, 16, 19, . . . ) is an arithmetic sequence because an+1 − an is constant for n = 1, 2, 3, . . . .

But the sequence (bn ) = (1, 1, 4, 7, 10, 13, 16, 19, . . . ) is not an arithmetic sequence because b2 − b1 = 0 ≠ b3 − b2 = 3.

The next fact is intuitively obvious. Clearly, there is no number that, for example, 4 + 7 + 10 + 13 + 16 + 19 + 22 + . . . is equal to.

Fact 12. An infinite arithmetic sequence (an ) has no sum of series, except in the trivial case where (an ) = (0, 0, 0, 0, 0, 0, . . . ).


23.1

Finite Arithmetic Sequences and Series

Every finite sequence, including arithmetic ones, has a sum of series.

Example 215. You’ve probably heard of the apocryphal story about an eight-year-old Gauss adding up the numbers from 1 to 100 in an instant. The trick is to pair the first number with the last, the second number with the second last, etc., then use multiplication. Like this:

$$1 + 2 + 3 + 4 + \dots + 100 = \overbrace{\underbrace{(1 + 100)}_{101} + \underbrace{(2 + 99)}_{101} + \underbrace{(3 + 98)}_{101} + \dots + \underbrace{(50 + 51)}_{101}}^{50 \text{ terms}} = 101 \times 50 = 5050.$$

In general, there is a simple formula for the sum of a finite arithmetic series: (First Term + Last Term) × (Number of Terms) ÷ 2.

Fact 13. The finite arithmetic series a1 + a2 + ⋅ ⋅ ⋅ + ak has sum of series $(a_1 + a_k) \dfrac{k}{2}$.

(We will only prove Fact 13 on p. 244.)

Example 216. Consider the arithmetic sequence (7, 17, 27, 37, . . . , 837). Its common difference is 10. The difference between the first and last terms is 830. And so the last term is 830 ÷ 10 = 83 terms after the first. Hence, there are in total 84 terms. By Fact 13, its sum of series is $(7 + 837) \times \dfrac{84}{2} = 35448$.

Example 217. Consider the arithmetic sequence (1, 5, 9, 13, 17, 21, 25, 29, 33, . . . , 393). Its common difference is 4. The difference between the first and last terms is 392. And so the last term is 392 ÷ 4 = 98 terms after the first. Hence, there are in total 99 terms. By Fact 13, its sum of series is $(1 + 393) \times \dfrac{99}{2} = 19503$.

Exercise 96. Rewrite each of the following arithmetic series in summation notation and compute their sums. (a) 2 + 7 + 12 + 17 + 22 + 27 + 32 + ⋅ ⋅ ⋅ + 997. (b) 3 + 20 + 37 + 54 + 71 + ⋅ ⋅ ⋅ + 1703. (c) 81 + 89 + 97 + 105 + 113 + ⋅ ⋅ ⋅ + 8081. (Answer on p. 996.)
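The recipe used in the examples above (count the terms, then apply Fact 13) can be checked against a brute-force sum. The function names below are invented for this sketch, and the term count assumes that the last term really is reached by repeatedly adding the common difference.

```python
def number_of_terms(first, last, difference):
    """Count the terms of a finite arithmetic sequence from first to last."""
    return (last - first) // difference + 1

def arithmetic_sum(first, last, count):
    """Fact 13: (first term + last term) x (number of terms) / 2."""
    return (first + last) * count / 2

# Example 215 (Gauss): 1 + 2 + ... + 100.
gauss_count = number_of_terms(1, 100, 1)
gauss_sum = arithmetic_sum(1, 100, gauss_count)

# Example 217: 1 + 5 + 9 + ... + 393.
count_217 = number_of_terms(1, 393, 4)
sum_217 = arithmetic_sum(1, 393, count_217)
brute_217 = sum(range(1, 394, 4))  # add the terms one by one as a cross-check

print(gauss_count, gauss_sum)
print(count_217, sum_217, brute_217)
```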


24

Geometric Sequences and Series

Example 218. Consider the finite sequence (1, 2, 4, 8, 16, 32, 64, 128). A corresponding function f for this sequence has

• Domain {1, 2, 3, 4, 5, 6, 7, 8} (a subset of Z+ );
• Codomain R; and
• Mapping rule f (1) = 1 and f (n + 1) ÷ f (n) = 2 for all n = 1, 2, . . . , 7.

This is an example of a finite geometric sequence.

Example 219. Consider the infinite sequence (1, 2, 4, 8, 16, 32, 64, 128, 256, 512, . . . ). A corresponding function f for this sequence has

• Domain Z+ ;
• Codomain R; and
• Mapping rule f (1) = 1 and f (n + 1) ÷ f (n) = 2 for all n = 1, 2, 3, . . . .

This is an example of an infinite geometric sequence.

Definition 58. A geometric sequence (or a geometric progression) is any sequence (an ) where an+1 ÷ an is constant for all n = 1, 2, 3, . . . . We call an+1 ÷ an the common ratio. We call the series for a geometric sequence a geometric series. And its sum of series (if it exists at all) is called a geometric sum of series.

Example 220. The sequence (an ) = (1, 2, 4, 8, 16, 32, . . . ) is a geometric sequence because an+1 ÷ an is constant for all n = 1, 2, 3, . . . .

But the sequence (bn ) = (1, 1, 2, 4, 8, 16, 32, . . . ) is not a geometric sequence because b2 ÷ b1 = 1 ≠ b3 ÷ b2 = 2.


24.1

Finite Geometric Sequences and Series

It turns out that just like with finite arithmetic series, there is a nice formula for the finite geometric series. Let’s start with the simple case first where the first term is simply 1.

Fact 14. $1 + r + r^2 + r^3 + \dots + r^{n-1} = \dfrac{1 - r^n}{1 - r}$.

Proof. Let $S = 1 + r + r^2 + \dots + r^{n-1}$. Then $rS = r + r^2 + r^3 + \dots + r^n$. Now take the difference: $S - rS = 1 - r^n$. Hence, $S = \dfrac{1 - r^n}{1 - r}$.

The trick used in the above proof is called the method of differences and the A-level syllabus requires you to know it.

The general case of a geometric series follows immediately from the above:

Fact 15. $a_1 + a_1 r + a_1 r^2 + a_1 r^3 + \dots + a_1 r^{n-1} = a_1 \dfrac{1 - r^n}{1 - r}$.

Example 221. Consider the geometric sequence (1, 2, 4, 8, 16, . . . , 1024). Its common ratio is 2. The ratio of the last term to the first is $1024 \div 1 = 1024 = 2^{10}$. And so the last term is 10 terms after the first. Hence, there are in total 11 terms. Thus, its sum of series is

$$1 \times \frac{1 - 2^{11}}{1 - 2} = \frac{-2047}{-1} = 2047.$$

Example 222. Consider the geometric sequence (4, 12, 36, 108, . . . , 8748). Its common ratio is 3. The ratio of the last term to the first is $8748 \div 4 = 2187 = 3^7$. And so the last term is 7 terms after the first. Hence, there are in total 8 terms. Thus, its sum of series is

$$4 \times \frac{1 - 3^8}{1 - 3} = 4 \times \frac{-6560}{-2} = 4 \times 3280 = 13120.$$

Exercise 97. Rewrite each of the following geometric series in summation notation and compute their sums. (a) 7 + 14 + 28 + 56 + ⋅ ⋅ ⋅ + 448 + 896. (b) 20 + 10 + 5 + ⋅ ⋅ ⋅ + 5/8. (c) 1 + 1/3 + 1/9 + ⋅ ⋅ ⋅ + 1/243. (Answer on p. 997.)
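As a quick cross-check of the finite geometric sum formula, the sketch below compares it with a term-by-term sum for the two worked examples. The function name is hypothetical; the guard for r = 1 is an addition made here, because the formula divides by 1 − r.

```python
from fractions import Fraction

def geometric_sum(a1, r, n):
    """Sum of a1 + a1*r + ... + a1*r**(n-1) via a1 * (1 - r**n) / (1 - r), for r != 1."""
    r = Fraction(r)
    if r == 1:
        raise ValueError("the formula divides by 1 - r, so r must not equal 1")
    return a1 * (1 - r ** n) / (1 - r)

# Example 221: first term 1, common ratio 2, 11 terms.
sum_221 = geometric_sum(1, 2, 11)
brute_221 = sum(1 * 2 ** i for i in range(11))

# Example 222: first term 4, common ratio 3, 8 terms.
sum_222 = geometric_sum(4, 3, 8)
brute_222 = sum(4 * 3 ** i for i in range(8))

print(sum_221, brute_221)
print(sum_222, brute_222)
```

Working with `Fraction` keeps the arithmetic exact even when the common ratio is a fraction such as 1/2 or 1/3.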


24.2

Infinite Geometric Sequences and Series

Perhaps surprisingly, it turns out that under a certain condition, an infinite geometric sequence can have a sum of series. Again, let’s start with the simple case:

Fact 16. If ∣r∣ < 1, then $1 + r + r^2 + r^3 + \dots = \dfrac{1}{1 - r}$.

Proof. Write the series as $S = 1 + r + r^2 + r^3 + \dots$. Then $rS = r + r^2 + r^3 + r^4 + \dots$. (By the way, we can also use summation notation for infinite series: $S = \sum_{n=0}^{\infty} r^n$ and $rS = \sum_{n=0}^{\infty} r^{n+1}$.)

Since ∣r∣ < 1, it follows that as n → ∞, $r^n \to 0$. Hence, if we take the difference, we have simply S − rS = 1. And so, $S = \dfrac{1}{1 - r}$.

The general case follows immediately:

Fact 17. If ∣r∣ < 1, then $a_1 + a_1 r + a_1 r^2 + a_1 r^3 + \dots = \dfrac{a_1}{1 - r}$.

The converse is also true:

Fact 18. If ∣r∣ ≥ 1, then $a_1 + a_1 r + a_1 r^2 + a_1 r^3 + \dots$ diverges.

Proof. Optional, see p. 855 in the Appendices.
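A numerical illustration (not a proof) of the difference between ∣r∣ < 1 and ∣r∣ ≥ 1: for a small common ratio the partial sums close in on a1/(1 − r), while for a large one they run away. The values a1 = 3, r = 0.5 and r = 2 are arbitrary choices made for this sketch.

```python
def geometric_partial_sum(a1, r, n):
    """Sum of the first n terms a1 + a1*r + ... + a1*r**(n-1)."""
    return sum(a1 * r ** i for i in range(n))

# |r| < 1: partial sums approach a1 / (1 - r).
a1, r = 3.0, 0.5
limit = a1 / (1 - r)
gaps = [abs(limit - geometric_partial_sum(a1, r, n)) for n in (5, 10, 20, 40)]

# |r| >= 1: partial sums keep growing instead of settling down.
growing = [geometric_partial_sum(1, 2, n) for n in (5, 10, 20)]

print(limit)
print(gaps)
print(growing)
```

The shrinking gaps correspond to the step "as n → ∞, rⁿ → 0" in the proof of the ∣r∣ < 1 case.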

Exercise 98. Rewrite each of the following infinite geometric series in summation notation and compute its sum. (a) 6 + 9/2 + 27/8 + . . . . (b) 20 + 10 + 5 + . . . . (c) 1 + 1/3 + 1/9 + . . . . (Answer on p. 997.)


25

Proof by the Method of Mathematical Induction

SYLLABUS ALERT

Proof by the method of mathematical induction is included in the 9740 (old) syllabus, but not in the 9758 (revised) syllabus. So you can skip this Chapter if you’re taking 9758.

We’ll now learn a new technique called proof by the method of mathematical induction. It’s pretty difficult, so go real slow.[33]

Imagine an infinite chain of dominos. Our goal is to knock all of them down. Suppose we manage to do two things:

1. “Knock down the 1st domino” (the base case).
2. Prove that “if the jth domino is knocked down, then so too is the (j + 1)th domino” (the inductive step).

Then we will have succeeded. Because once the 1st domino is knocked down, the inductive step implies that the 2nd domino is also knocked down, and now again by the inductive step the 3rd domino is also knocked down, and now again by the inductive step the 4th domino is also knocked down, ..., ad infinitum (to infinity).

[33] Which is perhaps why they decided to drop it from the revised 9758 syllabus! It does appear though as the first topic of Further Maths, which will be revived in 2017 and for which a free textbook will soon be appearing!


The metaphor of dominos is an apt description of the method of mathematical induction, which I’ll standardise into a three-step recipe:

The Method of Mathematical Induction

Step #1. Let P(k) be (shorthand for) the proposition to be proven. Our goal is to show that P(k) is true for all k = 1, 2, 3, . . .

Step #2 (the base case). Verify that P(1) is true.

Step #3 (the inductive step). Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ).

Step #1 rarely involves much work. Step #2 is usually, but not always, very easy. Step #3 is usually the hardest part; on the A-level exams, it usually just involves some (or a lot of) algebra.

Why does the method of mathematical induction work? Step #2 (the base case) shows that P(1) is true (“knock down the 1st domino”). Step #3 (the inductive step) then implies that P(2) is also true (“the falling 1st domino knocks down the 2nd domino”). Step #3 (the inductive step) then implies that P(3) is also true (“the falling 2nd domino knocks down the 3rd domino”). Step #3 (the inductive step) then implies that P(4) is also true (“the falling 3rd domino knocks down the 4th domino”). Ad infinitum (to infinity).

Thus, we have proven that P(k) is true for all k = 1, 2, 3, . . . , as desired.

Too abstract? Work through all the examples and exercises and you should find that it is not very difficult. For our first example, we’ll reprove an earlier fact, but now using the method of mathematical induction.


Fact 14 (reproduced from p. 238). $1 + r + r^2 + r^3 + \dots + r^{n-1} = \dfrac{1 - r^n}{1 - r}$.

Proof. Step #1. Let P(k) be (shorthand for) the proposition that

$$1 + r + r^2 + r^3 + \dots + r^{k-1} = \frac{1 - r^k}{1 - r}.$$

Our goal is to show that P(k) is true for all k = 1, 2, 3, . . .

Step #2. Verify that P(1) is true: $1 = \dfrac{1 - r^1}{1 - r}$. ✓

Step #3. Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ).

Assume that P(j) is true. That is,

$$1 + r + r^2 + r^3 + \dots + r^{j-1} \overset{1}{=} \frac{1 - r^j}{1 - r}.$$

Our goal is to show that P(j + 1) is true. That is,

$$1 + r + r^2 + r^3 + \dots + r^{j} = \frac{1 - r^{j+1}}{1 - r}.$$

To this end, write:

$$1 + r + r^2 + r^3 + \dots + r^j = \left(1 + r + r^2 + r^3 + \dots + r^{j-1}\right) + r^j \overset{1}{=} \frac{1 - r^j}{1 - r} + r^j = \frac{1 - r^j + (1 - r) r^j}{1 - r} = \frac{1 - r^{j+1}}{1 - r},$$

as desired.

In this particular instance, the method of mathematical induction was terribly cumbersome, compared to our earlier four-sentence proof (p. 238). But it turns out that in many other instances, this method is the best and sometimes the only tool to use. Let’s try more examples.

Example 223. Prove that $\sum_{r=1}^{n} r^2 = \dfrac{n(n + 1)(2n + 1)}{6}$.

Step #1. Let P(k) be (shorthand for) the proposition that $\sum_{r=1}^{k} r^2 = \dfrac{k(k + 1)(2k + 1)}{6}$. Our goal is to show that P(k) is true for all k = 1, 2, 3, . . .

Step #2. Verify that P(1) is true: $\sum_{r=1}^{1} r^2 = 1 = \dfrac{1(1 + 1)(2 \times 1 + 1)}{6}$. ✓

Step #3. Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ).

Assume that P(j) is true. That is, $\sum_{r=1}^{j} r^2 \overset{1}{=} \dfrac{j(j + 1)(2j + 1)}{6}$.

Our goal is to show that P(j + 1) is true. That is,

$$\sum_{r=1}^{j+1} r^2 = \frac{(j + 1)\,[(j + 1) + 1]\,[2(j + 1) + 1]}{6}.$$

To this end, write:

$$\begin{aligned}
\sum_{r=1}^{j+1} r^2 &\overset{2}{=} \sum_{r=1}^{j} r^2 + (j + 1)^2 \\
&\overset{3}{=} \frac{j(j + 1)(2j + 1)}{6} + (j + 1)^2 \qquad (\text{using } \overset{1}{=}) \\
&\overset{6}{=} \frac{j + 1}{6}\,[j(2j + 1) + 6(j + 1)] \\
&\overset{7}{=} \frac{j + 1}{6}\,(2j^2 + 7j + 6) \\
&\overset{5}{=} \frac{(j + 1)(j + 2)(2j + 3)}{6} \\
&\overset{4}{=} \frac{(j + 1)\,[(j + 1) + 1]\,[2(j + 1) + 1]}{6},
\end{aligned}$$

as desired.

I just used the “backwards-forwards method”. The order in which I wrote down each line is given by the numbers above each = sign.

Another trick is to exploit the fact that it has got to work out right. So for example, it might not immediately be obvious that $2j^2 + 7j + 6 = (j + 2)(2j + 3)$, but you know it has got to work out right and thus this must surely be true (unless of course you made some mistake with the algebra somewhere). And if you expand the RHS, you find that this equation is indeed true.
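Checking a formula for many values of k is not a proof (that is what induction is for), but it is a cheap way to catch algebra slips before or after writing one. The sketch below tests the sum-of-squares formula, its base case, and the inductive step numerically; all function names are invented for this illustration.

```python
def sum_of_squares(k):
    """Left-hand side: 1^2 + 2^2 + ... + k^2, added up directly."""
    return sum(r * r for r in range(1, k + 1))

def closed_form(k):
    """Right-hand side: k(k + 1)(2k + 1) / 6 (always a whole number)."""
    return k * (k + 1) * (2 * k + 1) // 6

# Base case P(1).
base_case = sum_of_squares(1) == closed_form(1)

# Spot-check P(k) for k = 1, ..., 200.
formula_agrees = all(sum_of_squares(k) == closed_form(k) for k in range(1, 201))

# Spot-check the inductive step: closed_form(j) + (j + 1)^2 == closed_form(j + 1).
step_agrees = all(
    closed_form(j) + (j + 1) ** 2 == closed_form(j + 1) for j in range(1, 201)
)

print(base_case, formula_agrees, step_agrees)
```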


Fact 13 (reproduced from p. 236). The finite arithmetic sequence (an )n≤k has sum of series $(a_1 + a_k) \dfrac{k}{2}$.

Proof. Step #1. Let P(k) be (shorthand for) the proposition that

$$a_1 + a_2 + \dots + a_k = \frac{k(a_1 + a_k)}{2}.$$

Our goal is to show that P(k) is true for all k = 1, 2, 3, . . .

Step #2. Verify that P(1) is true: $a_1 = \dfrac{1(a_1 + a_1)}{2}$. ✓

Step #3. Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ).

Assume that P(j) is true. That is,

$$a_1 + a_2 + \dots + a_j \overset{1}{=} \frac{j(a_1 + a_j)}{2}.$$

Our goal is to show that P(j + 1) is true. That is,

$$a_1 + a_2 + \dots + a_{j+1} = \frac{(j + 1)(a_1 + a_{j+1})}{2}.$$

Let’s first observe that $a_j - a_1 = (j - 1)(a_{j+1} - a_j)$. In words, this equation says: Consider the difference between the jth term and the first term; it is equal to j − 1 times the difference between two consecutive terms. Rearranging, we have $a_j \overset{2}{=} \dfrac{(j - 1)a_{j+1} + a_1}{j}$.

Now write:

$$\begin{aligned}
a_1 + a_2 + \dots + a_{j+1} &= (a_1 + a_2 + \dots + a_j) + a_{j+1} \\
&\overset{1}{=} \frac{j(a_1 + a_j)}{2} + a_{j+1} \\
&\overset{2}{=} \frac{j\left\{a_1 + [(j - 1)a_{j+1} + a_1]/j\right\}}{2} + a_{j+1} = \frac{j a_1 + (j - 1)a_{j+1} + a_1}{2} + a_{j+1} \\
&= \frac{(j + 1)a_1 + (j - 1)a_{j+1}}{2} + a_{j+1} = \frac{(j + 1)a_1 + (j - 1)a_{j+1} + 2a_{j+1}}{2} \\
&= \frac{(j + 1)a_1 + (j + 1)a_{j+1}}{2} = \frac{(j + 1)(a_1 + a_{j+1})}{2},
\end{aligned}$$

as desired.

Exercise 99. (Answer on p. 998.) Prove that

$$\sum_{r=1}^{n} r^3 = \left[\frac{n(n + 1)}{2}\right]^2.$$

By the way, this shows that $\sum_{r=1}^{n} r^3 = \left(\sum_{r=1}^{n} r\right)^2$.

Exercise 100. (Answer on p. 999.) Let a ∈ R. Prove that

$$\sum_{r=1}^{n} r a^r = a\,\frac{1 - (n + 1)a^n + n a^{n+1}}{(1 - a)^2}.$$

Exercise 101. (Answer on p. 1000.) Prove that

$$\sum_{r=1}^{n} r^4 = \frac{n(n + 1)(2n + 1)(3n^2 + 3n - 1)}{30}.$$

Part III

Vectors


26

Quick Revision of Some O-Level Maths

26.1

Lines vs. Line Segments vs. Rays

A line is infinite, while a line segment is finite.

Example 224. The line running through points a and b goes on forever, in both directions (red dotted line). In contrast, the line segment ab is finite. The line ab is a different mathematical object from the line segment ab.

[Figure: the line through the points a and b (red, dotted), the line segment ab, and the ray from a towards b (grey).]

The length of the line segment ab is thus a well-defined concept. In contrast, it makes no sense to talk about the length of the line ab.

A ray is a portion of a line, beginning at some point along the line, then going towards infinity. You can think of a ray as a half-infinite-line. The figure above illustrates in grey the ray that starts from the point a and goes in the direction b.

This textbook will strictly reserve the word ray to mean a half-infinite-line. But you should know that some other writers use ray to mean a (finite) line segment.


26.2

Angles are Measured in Radians

Professional mathematicians do not use the degree ° to measure angles; instead, they use the radian. This textbook will follow professional practice.

The radian is defined to be a ratio of one length to another. It is thus a “unitless” unit.

Definition 59. The magnitude, in radians, of an angle subtended by an arc, is the ratio of the length of the arc to the length of the radius of the circle.

Example 225. The circle below has radius r and thus circumference 2πr.

In radians, the angles P, Q, and R are given by:

$$P = \frac{ab}{r} = \frac{\pi r/2}{r} = \frac{\pi}{2} \ (= 90^\circ), \qquad Q = \frac{cd}{r} = \frac{\pi r}{r} = \pi \ (= 180^\circ), \qquad R = \frac{ae}{r} = \frac{r}{r} = 1 \ (= 57.29\ldots^\circ).$$
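The definition of the radian as arc length divided by radius can be tried out directly. In this sketch the radius value 2.0 is an arbitrary assumption (any positive radius gives the same angles), and `math.degrees` from Python's standard library converts back to degrees.

```python
import math

def angle_in_radians(arc_length, radius):
    """Definition 59: angle = (length of arc) / (length of radius)."""
    return arc_length / radius

r = 2.0  # arbitrary radius; the resulting angles do not depend on it

P = angle_in_radians(math.pi * r / 2, r)  # quarter of the circumference
Q = angle_in_radians(math.pi * r, r)      # half of the circumference
R = angle_in_radians(r, r)                # arc exactly as long as the radius

print(P, math.degrees(P))
print(Q, math.degrees(Q))
print(R, math.degrees(R))
```

Because the arc length and the radius are both lengths, their units cancel, which is why the radian is described as a "unitless" unit.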


26.3

Angles - Acute, Right, Obtuse, Straight, Reflex

Angles are given different names, depending on their size.

θ is the zero angle if θ = 0,
θ is an acute angle if θ ∈ (0, π/2),
θ is a right angle if θ = π/2,
θ is an obtuse angle if θ ∈ (π/2, π),
θ is a straight angle if θ = π,
θ is a reflex angle if θ ∈ (π, 2π).

In the figure below, the angle A is acute, R is right, O is obtuse, S is straight, and X is reflex. The zero angle is not depicted. By convention, every angle is depicted as a sector of a circle, unless it is a right angle, in which case it is depicted by a square.

[Figure: the angles A (acute), R (right), O (obtuse), S (straight), and X (reflex).]

26.4

Triangles - Acute, Right, Obtuse

Triangles are also given different names, depending on the size of their largest angle. A triangle is:

• Acute if its largest angle is acute;
• Right if its largest angle is right; and
• Obtuse if its largest angle is obtuse.

In the figure below, the largest angle of each triangle is highlighted.

[Figure: an obtuse triangle, an acute triangle, and a right triangle.]
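The naming rules for angles and triangles amount to a handful of comparisons against π/2 and π. The sketch below is one possible encoding; the function names are made up, and `math.isclose` is used for the "exactly equal" cases because floating-point values of π/2 and π are only approximations.

```python
import math

def classify_angle(theta):
    """Name an angle theta (in radians, with 0 <= theta < 2*pi) by its size."""
    if math.isclose(theta, 0.0, abs_tol=1e-12):
        return "zero"
    if math.isclose(theta, math.pi / 2):
        return "right"
    if math.isclose(theta, math.pi):
        return "straight"
    if 0 < theta < math.pi / 2:
        return "acute"
    if math.pi / 2 < theta < math.pi:
        return "obtuse"
    if math.pi < theta < 2 * math.pi:
        return "reflex"
    raise ValueError("angle must lie in [0, 2*pi)")

def classify_triangle(angles):
    """Name a triangle by its largest angle; the three angles must add up to pi."""
    if len(angles) != 3 or not math.isclose(sum(angles), math.pi):
        raise ValueError("expected three angles adding up to pi radians")
    return classify_angle(max(angles))

print(classify_angle(math.pi / 3), classify_angle(3 * math.pi / 2))
print(classify_triangle([math.pi / 2, math.pi / 4, math.pi / 4]))
print(classify_triangle([2 * math.pi / 3, math.pi / 6, math.pi / 6]))
```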


26.5

Defining the Trigonometric Functions

The trigonometric functions (also known as the circular functions) are sine, cosine, tangent, cosecant, secant, and cotangent (respectively denoted sin, cos, tan, csc, sec, and cot).

Draw a unit circle. Pick any point p = (px , py ) on the unit circle. Let A be the angle that the line segment op makes with the positive x-axis. Note that the line segment op has length 1, because it is the radius of a unit circle. Each of the trigonometric functions maps the angle A to some real number.

Informally, where A is acute (the point p is in the top-right quadrant of the cartesian plane), the sine, cosine, and tangent functions are defined using the mnemonic “SOH, CAH, TOA” (Sine is Opposite over Hypotenuse, Cosine is Adjacent over Hypotenuse, and Tangent is Opposite over Adjacent). And the functions cosecant, secant, and cotangent are simply defined as the reciprocals of the sine, cosine, and tangent functions (respectively). Formally:

Definition 60. The sine, cosine, tangent, cosecant, secant, and cotangent functions are real-valued functions defined to have the following domains and mapping rules. Also listed are their ranges.

Function | Domain | Range | Mapping Rule
sin | R | [−1, 1] | A ↦ py
cos | R | [−1, 1] | A ↦ px
tan | ⋅ ⋅ ⋅ ∪ (−3π/2, −π/2) ∪ (−π/2, π/2) ∪ (π/2, 3π/2) ∪ . . . | R | A ↦ py /px
csc | ⋅ ⋅ ⋅ ∪ (−π, 0) ∪ (0, π) ∪ (π, 2π) ∪ . . . | R ∖ (−1, 1) | A ↦ 1/py
sec | ⋅ ⋅ ⋅ ∪ (−3π/2, −π/2) ∪ (−π/2, π/2) ∪ (π/2, 3π/2) ∪ . . . | R ∖ (−1, 1) | A ↦ 1/px
cot | ⋅ ⋅ ⋅ ∪ (−π, 0) ∪ (0, π) ∪ (π, 2π) ∪ . . . | R | A ↦ px /py

Sine and cosine fluctuate between −1 and 1. We describe their fluctuations as being sinusoidal. In contrast, tangent fluctuates between −∞ and ∞. At half-integer multiples of π, the tangent function is undefined.

You don’t need to memorise the following (because you have a calculator). But you will solve problems a little more quickly if you have these memorised.

x        0    π/6    π/4    π/3    π/2        2π/3    3π/4    5π/6    π
sin x    0    1/2    √2/2   √3/2   1          √3/2    √2/2    1/2     0
cos x    1    √3/2   √2/2   1/2    0          −1/2    −√2/2   −√3/2   −1
tan x    0    √3/3   1      √3     Undefined  −√3     −1      −√3/3   0
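The table of special values can be reproduced numerically. The following is only a minimal sketch using Python's standard `math` module; the helper name `trig_row` and the tolerance used to detect where tangent is undefined are assumptions made for illustration.

```python
import math

# Special angles from the table, as (label, value in radians) pairs.
ANGLES = [
    ("0", 0.0),
    ("pi/6", math.pi / 6),
    ("pi/4", math.pi / 4),
    ("pi/3", math.pi / 3),
    ("pi/2", math.pi / 2),
    ("2pi/3", 2 * math.pi / 3),
    ("3pi/4", 3 * math.pi / 4),
    ("5pi/6", 5 * math.pi / 6),
    ("pi", math.pi),
]


def trig_row(x):
    """Return (sin x, cos x, tan x); tan is None where cos x is numerically zero."""
    s, c = math.sin(x), math.cos(x)
    # Assumption: treat |cos x| below 1e-12 as zero, so tan x is reported as undefined.
    t = None if abs(c) < 1e-12 else s / c
    return s, c, t


for label, x in ANGLES:
    s, c, t = trig_row(x)
    tan_text = "undefined" if t is None else f"{t:.4f}"
    print(f"x = {label:>5}: sin = {s:.4f}, cos = {c:.4f}, tan = {tan_text}")
```

Floating-point rounding means some entries print as very small numbers rather than exact zeros, which is why the check for an undefined tangent uses a tolerance.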

Exercise 102. Graph the following equations: y = csc x, y = sec x, and y = cot x. (Answer on p. 1001.)

26.6

Formulae for Sine, Cosine, and Tangent

For all x for which all expressions are well defined, we have the following formulae:

$$\sin(-x) = -\sin x, \quad \cos(-x) = \cos x, \quad \tan(-x) = -\tan x,$$

$$\sin(x + 2\pi) = \sin x, \quad \cos(x + 2\pi) = \cos x, \quad \tan(x + 2\pi) = \tan x.$$

The following formulae will appear in the List of Formulae you’ll get during exams, so you don’t need to memorise them. Exam Tip: Whenever you see a question with trigonometric functions, make sure you have this list right next to you!

For all A, B, P, Q for which all expressions are well defined, we have:

$$\sin(A \pm B) = \sin A \cos B \pm \cos A \sin B,$$

$$\cos(A \pm B) = \cos A \cos B \mp \sin A \sin B,$$

$$\tan(A \pm B) = \frac{\tan A \pm \tan B}{1 \mp \tan A \tan B},$$

$$\sin 2A = 2 \sin A \cos A,$$

$$\cos 2A = \cos^2 A - \sin^2 A = 2\cos^2 A - 1 = 1 - 2\sin^2 A,$$

$$\tan 2A = \frac{2 \tan A}{1 - \tan^2 A},$$

$$\sin P + \sin Q = 2 \sin\frac{P+Q}{2} \cos\frac{P-Q}{2},$$

$$\sin P - \sin Q = 2 \cos\frac{P+Q}{2} \sin\frac{P-Q}{2},$$

$$\cos P + \cos Q = 2 \cos\frac{P+Q}{2} \cos\frac{P-Q}{2},$$

$$\cos P - \cos Q = -2 \sin\frac{P+Q}{2} \sin\frac{P-Q}{2}.$$
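A quick numerical spot-check can help catch a misremembered sign in the addition and sum-to-product formulae. The sketch below is illustrative only: the function names and the sample angles are assumptions, and a numerical check at a few angles is of course not a proof.

```python
import math


def check_addition_formulae(A, B):
    """Spot-check the sin, cos, and tan addition formulae at the angles A and B."""
    sin_ok = math.isclose(
        math.sin(A + B),
        math.sin(A) * math.cos(B) + math.cos(A) * math.sin(B),
        abs_tol=1e-12,
    )
    cos_ok = math.isclose(
        math.cos(A + B),
        math.cos(A) * math.cos(B) - math.sin(A) * math.sin(B),
        abs_tol=1e-12,
    )
    tan_ok = math.isclose(
        math.tan(A + B),
        (math.tan(A) + math.tan(B)) / (1 - math.tan(A) * math.tan(B)),
        rel_tol=1e-9,
    )
    return sin_ok and cos_ok and tan_ok


def check_sum_to_product(P, Q):
    """Spot-check sin P + sin Q = 2 sin((P+Q)/2) cos((P-Q)/2)."""
    lhs = math.sin(P) + math.sin(Q)
    rhs = 2 * math.sin((P + Q) / 2) * math.cos((P - Q) / 2)
    return math.isclose(lhs, rhs, abs_tol=1e-12)


# Sample angles (in radians) chosen arbitrarily, away from where tan is undefined.
print(check_addition_formulae(0.3, 1.1))
print(check_sum_to_product(0.7, 2.5))
```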

26.7

Arcsine, Arccosine, Arctangent

We define sin² A to be the square of sin A. One might thus suppose that analogously, sin⁻¹ x = 1/sin x, but this is not so! Instead:

Definition 61. The arcsine function, denoted sin⁻¹, has domain [−1, 1], codomain (and range) [−0.5π, 0.5π], and rule x ↦ y where sin y = x.

Below is the graph of the arcsine function. The endpoints (−1, −0.5π) and (1, 0.5π) are marked with red dots.

[Figure: graph of y = sin⁻¹ x for x from −1 to 1, with y running from −0.5π to 0.5π.]

A remark about principal values. We refer to [−0.5π, 0.5π] as the principal values of the arcsine function. What does this mean?

Angles come full circle every 2π radians. And so for example, sin(π/6) = 0.5. But also sin(π/6 + 2π) = 0.5. And also sin(π/6 + 4π) = 0.5. And also sin(π/6 − 2π) = 0.5. Indeed, sin(π/6 + 2kπ) = 0.5 for any k ∈ Z. We say that the sine function is periodic.

Yet we do not say that sin⁻¹(0.5) = π/6 + 2kπ for any k ∈ Z, because this would mean that sin⁻¹ maps each element in the domain to more than one (indeed infinitely many) elements in the codomain. And so sin⁻¹ wouldn’t be a function.

Instead, we define the arcsine function so that its principal values are [−0.5π, 0.5π]. That is, the codomain of the arcsine function is [−0.5π, 0.5π]. And thus, sin⁻¹(0.5) = π/6.

Note that the choice of [−0.5π, 0.5π] as the principal values of the arcsine function is a somewhat arbitrary convention. We could equally well have chosen, say, [0.5π, 1.5π] as our principal values. It’s nicer though that our principal values are centred on 0.
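The idea of principal values can be seen on a calculator or in code. The sketch below uses Python's `math.asin`, which returns values in [−π/2, π/2]; the helper name `solutions_of_sin` is an assumption introduced for illustration.

```python
import math


def solutions_of_sin(value, k_values):
    """Return the angles asin(value) + 2*k*pi for each integer k in k_values."""
    principal = math.asin(value)  # the principal value, always in [-pi/2, pi/2]
    return [principal + 2 * k * math.pi for k in k_values]


# Many angles have sine equal to 0.5 ...
for y in solutions_of_sin(0.5, range(-2, 3)):
    print(f"sin({y:.4f}) = {math.sin(y):.4f}")

# ... but arcsine returns only the principal value, which is pi/6.
print(math.asin(0.5), math.pi / 6)

# 5*pi/6 also has sine 0.5, yet arcsine does not map 0.5 back to 5*pi/6.
print(math.asin(math.sin(5 * math.pi / 6)))
```

The last line illustrates that sin⁻¹(sin y) need not equal y when y lies outside the principal values.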


Definition 62. The arccosine function, denoted cos⁻¹, has domain [−1, 1], codomain (and range) [0, π], and rule x ↦ y where cos y = x.

Below is the graph of the arccosine function. The endpoints (−1, π) and (1, 0) are marked with blue dots.

Note that [0, π] are the principal values of the arccosine function. Why can’t we select [−0.5π, 0.5π] as the principal values for the arccosine function, like we did for the arcsine function? (See footnote 34.)

Footnote 34. Because then cos⁻¹(−1), for example, would be undefined.

[Figure: graph of y = cos⁻¹ x for x from −1 to 1, with y running from π down to 0.]

Definition 63. The arctangent function, denoted tan⁻¹, has domain R, codomain (and range) (−0.5π, 0.5π), and rule x ↦ y where tan y = x.

Below is the graph of the arctangent function. There are two horizontal asymptotes, namely y = 0.5π and y = −0.5π. That is, as x → ±∞, y → ±0.5π.

Note that (−0.5π, 0.5π) are the principal values of the arctangent function.

[Figure: graph of y = tan⁻¹ x for x from −10 to 10, with horizontal asymptotes y = 0.5π and y = −0.5π.]

Remark 7. This notation can be tremendously confusing, which is why many writers prefer to write arcsin x, arccos x, and arctan x instead of sin⁻¹ x, cos⁻¹ x, and tan⁻¹ x. But the Singapore-Cambridge A-level syllabus does not use the arcsin x, arccos x, or arctan x notation and so neither shall this textbook.

26.8

The Law of Sines and the Law of Cosines

Consider a triangle with sides of lengths a, b, and c and angles A, B, and C.

[Figure: triangle with sides a, b, c and angles A, B, C. The altitude from the vertex at angle B has length a sin C and splits side b into two segments of lengths a cos C and b − a cos C.]

Proposition 4. A triangle with sides of lengths a, b, and c and angles A, B, and C has area 0.5ab sin C.

Proof. The triangle has base b and height a sin C. Hence, its area is 0.5ab sin C.

Proposition 5. (The Law of Sines.) For a triangle with sides of lengths a, b, and c and angles A, B, and C,

$$\frac{a}{\sin A} = \frac{b}{\sin B} = \frac{c}{\sin C}.$$

Proof. The area of the above triangle is 0.5ab sin C. By symmetry, it is also 0.5bc sin A and 0.5ac sin B. Equate these, divide by 0.5abc, and take reciprocals:

$$0.5ab \sin C = 0.5bc \sin A = 0.5ac \sin B \iff \frac{\sin C}{c} = \frac{\sin A}{a} = \frac{\sin B}{b} \iff \frac{a}{\sin A} = \frac{b}{\sin B} = \frac{c}{\sin C}.$$

Proposition 6. (The Law of Cosines.) For a triangle with sides of lengths a, b, and c and angles A, B, and C, c² = a² + b² − 2ab cos C.

Proof. (Optional.) By the Pythagorean Theorem,

$$c^2 = (a \sin C)^2 + (b - a \cos C)^2 = a^2 \sin^2 C + b^2 - 2ab \cos C + a^2 \cos^2 C$$

$$= a^2 (\sin^2 C + \cos^2 C) + b^2 - 2ab \cos C = a^2 + b^2 - 2ab \cos C,$$

where the last step uses the identity sin² C + cos² C = 1.

One perhaps-obvious implication of the Law of Cosines is that the length of any one side of a triangle is always less than the sum of the lengths of the other two sides.

Corollary 3. For a triangle with sides of lengths a, b, and c, a < b + c.

Proof. c² = a² + b² − 2ab cos C = a² + b² − 2ab + 2ab − 2ab cos C = (a − b)² + 2ab(1 − cos C) > (a − b)², where the strict inequality holds because cos C < 1 for any angle C of a triangle. Hence, c > a − b or a < b + c.
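The area formula, the Law of Sines, and the Law of Cosines can be checked against a familiar triangle. The sketch below uses a 3-4-5 right triangle; the function names `third_side` and `triangle_area` and the sample values are assumptions chosen for illustration.

```python
import math


def third_side(a, b, C):
    """Law of Cosines: length of the side opposite angle C (in radians)."""
    return math.sqrt(a * a + b * b - 2 * a * b * math.cos(C))


def triangle_area(a, b, C):
    """Area of a triangle from two sides and the included angle C (in radians)."""
    return 0.5 * a * b * math.sin(C)


# Sample triangle: sides a = 3 and b = 4 enclosing a right angle C.
a, b, C = 3.0, 4.0, math.pi / 2
c = third_side(a, b, C)

# Recover the angle A opposite side a by applying the Law of Cosines to side a.
A = math.acos((b * b + c * c - a * a) / (2 * b * c))

print("c =", c)
print("area =", triangle_area(a, b, C))
# Law of Sines: these two ratios should agree up to rounding.
print(a / math.sin(A), c / math.sin(C))
```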

27

Vectors in Two Dimensions (2D)

Recall that a point is simply an ordered pair of real numbers. Example 226. The points a = (−1, 2), b = (3, −1), c = (−1, 1), and d = (3, −2) can be illustrated graphically on the cartesian plane. The origin (0, 0) is usually named o.

[Figure: the points a = (−1, 2), b = (3, −1), c = (−1, 1), d = (3, −2), and the origin o plotted on the cartesian plane.]

We now introduce an entirely new mathematical object, called a vector. We will not formally define vectors, because to do so would require more maths than is covered at A-level. But informally, a vector is an “arrow” with two properties: direction and length.


Example 227. In the figure, $\overrightarrow{ab}$, $\overrightarrow{cd}$, and u are all vectors. (As we’ll see, there are multiple ways to denote vectors.)

[Figure: the points a, b, c, d on the cartesian plane, showing the vector $\overrightarrow{ab} = \vec{v} = \mathbf{v}$ (4 units right, 3 units down, length 5), the vector $\overrightarrow{cd} = \vec{v} = \mathbf{v}$, and the vector u.]

Given two points a and b, $\overrightarrow{ab}$ denotes the vector from point a to point b.

The word vector means carrier (in Latin). You may have learnt in biology that mosquitoes are vectors, because they carry diseases (to humans). In mathematics likewise, a vector carries us from one point to another.

Example 228. The vector $\overrightarrow{ab}$ carries us from point a to point b. The vector $\overrightarrow{cd}$ carries us from point c to point d.

Like a point, a vector can be described as being an ordered pair of real numbers.

Example 229. The vector $\overrightarrow{ab} = (4, -3)$ carries us 4 units to the right and 3 units down. The vector $\overrightarrow{cd} = (4, -3)$ carries us 4 units to the right and 3 units down. The vector u = (2, −1.5) carries us 2 units to the right and 1.5 units down. Note that we’re now using the (x, y) ordered set notation for the third time! (See footnote 35.)

Footnote 35. So far, we have used (x, y) to denote (i) an open interval, specifically the set of real numbers greater than x but smaller than y; (ii) the ordered pair of real numbers x and y; and now also (iii) the vector that carries us x units to the right and y units up.

Do not confuse a point with a vector!

Example 230. The point (4, −3) is a zero-dimensional object. In contrast, the vector (4, −3) is a two-dimensional object.

The vector (x, y) can also be written as $\begin{pmatrix} x \\ y \end{pmatrix}$.

Example 231. We can write $(4, -3) = \begin{pmatrix} 4 \\ -3 \end{pmatrix}$ and $u = (2, -1.5) = \begin{pmatrix} 2 \\ -1.5 \end{pmatrix}$.

The $\begin{pmatrix} a \\ b \end{pmatrix}$ notation for vectors is very useful, because as we’ll see shortly, we’ll be doing a lot of addition and multiplication with vectors, and this notation can help us see better (in a literal sense). But in print, I’ll often prefer using the (a, b) notation, simply because this takes up less space.

Given the vector $\overrightarrow{ab}$, the point a is called the vector’s tail and the point b is called the vector’s head. This is potentially confusing, so always remember: a vector carries us from tail to head and not the other way round!

A vector is defined by two characteristics: direction and length. It must be stressed that the tail and head of a vector do not matter. Only the direction and length do. So long as two vectors have the same direction and length, they are considered to be the exact same vector. The following examples illustrate this.

Example 232. Informally, $\overrightarrow{ab}$, $\overrightarrow{cd}$, and u all point in the same direction. $\overrightarrow{ab}$ and $\overrightarrow{cd}$ have the same length, which we can compute using the Pythagorean Theorem as $\sqrt{3^2 + 4^2} = 5$. Hence, $\overrightarrow{ab}$ and $\overrightarrow{cd}$ are considered to be exactly the same: $\overrightarrow{ab} = \overrightarrow{cd}$. Even though they have different heads and tails, both $\overrightarrow{ab}$ and $\overrightarrow{cd}$ carry us 4 units right and 3 units down. The vector (4, −3) can carry us from a to b or from c to d. Thus, $\overrightarrow{cd} = (4, -3) = \overrightarrow{ab}$. They are one and the same vector.

In contrast, the vector u has only half the length of $\overrightarrow{ab}$ and so $u \neq \overrightarrow{ab}$. (Indeed, as we shall learn later, we can write $u = 0.5\,\overrightarrow{ab}$.)

Example 233. The vector (0, −1) can carry us from a to c or from b to d. Thus, $\overrightarrow{bd} = (0, -1) = \overrightarrow{ac}$. But $\overrightarrow{ca} = (0, 1) \neq \overrightarrow{ac} = (0, -1)$, and $(0, -0.5) \neq \overrightarrow{ac} = (0, -1)$.

Yet another way of denoting vectors is by a single letter, either with a right arrow overhead or in bold font. For example, in the figure above, the vector $\overrightarrow{ab}$ or $\overrightarrow{cd}$ is also named using the letter v, either as $\vec{v}$ or as bold-font $\mathbf{v}$.

Example 234. So altogether, I can write the vector $\overrightarrow{ab}$ in five different ways:

$$\overrightarrow{ab} = \vec{v} = \mathbf{v} = (4, -3) = \begin{pmatrix} 4 \\ -3 \end{pmatrix}.$$

Given a choice between writing $\vec{v}$ or $\mathbf{v}$, the bold font $\mathbf{v}$ is preferred in print publications. But in handwriting, most people prefer $\vec{v}$ (because writing in bold font is hard).

Exercise 103. Using a, b, c, or d from the above figure as the tail and a distinct point as the head, there are 12 possible vectors. We’ve already written out 4 of these in the last two examples. Write out the other 8 in ordered set notation. (Answer on p. 1002.)


The position vector of a point a is simply the vector from the origin o = (0, 0) to the point a. Formally:

Definition 64. Given a point a = (a₁, a₂), its position vector is the vector $\mathbf{a}$ = (a₁, a₂).

The position vector of the point a carries us from the origin o to the point a and so it can also be denoted $\overrightarrow{oa}$. Take care not to confuse the point a = (a₁, a₂) with the vector $\mathbf{a}$ = (a₁, a₂): they are different objects!

Informally, the zero vector is the vector that carries us nowhere. Formally:

Definition 65. The zero vector is the vector (0, 0) and can be denoted $\mathbf{0}$ or $\vec{0}$.

27.1

Sum and Difference of Points and Vectors

Here is a quick summary of what you’ll learn in this section.

(1) Point + Point = Undefined,
(2) Point − Point = Vector,
(3) Point + Vector = Point,
(4) Point − Vector = Point.

1. Point + Point = Undefined

If a and b are points, then there is no such thing as a + b. (See footnote 36.)

Footnote 36. At least in this textbook (and in the A-levels).

The analogy is to points in the real world: it makes no sense to talk about the sum of two locations.

Example 235. Consider the points Paris and Tokyo. The sum Paris + Tokyo = ?? is undefined. It makes no sense to talk about the sum of two locations.

[Figure: the vector b − a drawn from point a to point b; the vector v drawn from point p to the point p + v; and the vector u drawn from the point q − u to point q.]

2. Point − Point = Vector

Definition 66. Given two points a = (a₁, a₂) and b = (b₁, b₂), their difference b − a is defined to be the vector from a to b, i.e., b − a = (b₁ − a₁, b₂ − a₂).

Example 236. Paris − Tokyo = the journey that carries us from Tokyo to Paris. We might write Paris − Tokyo = (−9000 km, 1000 km), meaning that to get from Tokyo to Paris, we must travel 9,000 km west and 1,000 km north. It makes sense to talk about the distance of the journey from Tokyo to Paris. Shortly, we’ll see that it similarly makes sense to talk about the length of the vector from a to b.

Example 237. (See figure on p. 262.) Given the points a = (−1, 2) and b = (3, −1), their difference b − a is the vector from a to b, i.e., b − a = (3 − (−1), −1 − 2) = (4, −3).

3. Point + Vector = Point

Definition 67. Given the point p = (p₁, p₂) and the vector v = (v₁, v₂), their sum p + v is defined to be the point p + v = (p₁ + v₁, p₂ + v₂).

Geometrically, if the vector v has tail p, then it also has head p + v.

Example 238. Tokyo + (−9000 km, 1000 km) = Paris. This says that starting from Tokyo, if we embark on a journey that carries us 9,000 km west and 1,000 km north, then we’ll end up in Paris.

Example 239. (See figure on p. 262.) Consider the vector (4, −3). If its tail is a = (−1, 2), then its head is (−1, 2) + (4, −3) = (3, −1) = b. And if its tail is c = (−1, 1), then its head is (−1, 1) + (4, −3) = (3, −2) = d.

4. Point − Vector = Point

Definition 68. Given the point q = (q₁, q₂) and the vector u = (u₁, u₂), their difference q − u is defined to be the point q − u = (q₁ − u₁, q₂ − u₂).

Geometrically, if the vector u has head q, then it also has tail q − u.

Example 240. Paris − (−9000 km, 1000 km) = Tokyo. This says that starting from Paris, if we embark on a journey that is the exact opposite of going 9,000 km west and 1,000 km north (equivalently, we embark on a journey that goes 9,000 km east and 1,000 km south), then we’ll end up in Tokyo.

Example 241. (See figure on p. 262.) Consider again the vector (4, −3). If its head is b = (3, −1), then its tail is (3, −1) − (4, −3) = (−1, 2) = a. And if its head is d = (3, −2), then its tail is (3, −2) − (4, −3) = (−1, 1) = c.

Exercise 104. Consider the vector (4, −3). (a) If it has tail (0, 0), then what is its head? (b) If it has head (0, 0), then what is its tail? (c) If it has tail (5, 2), then what is its head? (d) If it has head (5, 2), then what is its tail? (Answer on p. 1002.)
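The four rules of this section (Point + Point is undefined, Point − Point is a Vector, Point ± Vector is a Point) can be mirrored in code by giving points and vectors separate types. The following is only an illustrative sketch: the class names `Point` and `Vector` and the use of Python dataclasses are assumptions, not anything prescribed by the syllabus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vector:
    x: float
    y: float


@dataclass(frozen=True)
class Point:
    x: float
    y: float

    def __add__(self, other):
        # Point + Vector = Point. Point + Point is deliberately left undefined.
        if isinstance(other, Vector):
            return Point(self.x + other.x, self.y + other.y)
        return NotImplemented

    def __sub__(self, other):
        # Point - Point = Vector (the vector from `other` to `self`).
        if isinstance(other, Point):
            return Vector(self.x - other.x, self.y - other.y)
        # Point - Vector = Point.
        if isinstance(other, Vector):
            return Point(self.x - other.x, self.y - other.y)
        return NotImplemented


a, b = Point(-1, 2), Point(3, -1)
v = b - a                      # the vector from a to b
print(v)
print(a + v == b, b - v == a)  # tail + vector gives head; head - vector gives tail

try:
    a + b                      # adding two points is undefined
except TypeError as error:
    print("Point + Point is undefined:", error)
```

Returning `NotImplemented` for unsupported operands lets Python raise a `TypeError`, which is one way to model "undefined".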

27.2

Sum, Additive Inverse, and Difference of Vectors

Here is a quick summary of what you’ll learn in this section.

(1) Vector + Vector = Vector,
(2) − Vector = Vector (additive inverse),
(3) Vector − Vector = Vector.

1. Vector + Vector = Vector

Definition 69. If u = (u₁, u₂) and v = (v₁, v₂) are vectors, then their sum, denoted u + v, is the vector defined by u + v = (u₁ + v₁, u₂ + v₂).

Geometrically, if the tail of v is the head of u, then u + v is the vector from the tail of u to the head of v.

[Figure: the vectors u and v placed head to tail, with u + v drawn from the tail of u to the head of v.]

Example 242. (See figure on p. 262.) $\overrightarrow{ab} + \overrightarrow{bc} = (4, -3) + (-4, 2) = (0, -1) = \overrightarrow{ac}$.

Example 243. (See figure on p. 262.) $\overrightarrow{ad} + \overrightarrow{cb} = (4, -4) + (4, -2) = (8, -6)$.

2. − Vector = Vector (Additive inverse)

Definition 70. If v = (v₁, v₂), then its additive inverse, denoted −v, is defined by −v = (−v₁, −v₂).

Geometrically, if the vector v is from point a to point b, then −v is the vector from point b to point a. And so informally, the additive inverse is simply the same vector but flipped in the opposite direction.

Example 244. The additive inverse of $\overrightarrow{ab}$ is $\overrightarrow{ba}$. That is, $-\overrightarrow{ab} = \overrightarrow{ba}$.

Example 245. The additive inverse of $\overrightarrow{bc}$ is $\overrightarrow{cb}$. That is, $-\overrightarrow{bc} = \overrightarrow{cb}$.

3. Vector − Vector = Vector

Definition 71. Given two vectors u and v, their difference, denoted u − v, is defined to be the sum of the vectors u and −v. Or equivalently, if u = (u₁, u₂) and v = (v₁, v₂), then u − v is the vector defined by u − v = (u₁ − v₁, u₂ − v₂).

Geometrically, if we place the heads of u and v at the same point, then u − v is the vector from the tail of u to the tail of v.

[Figure: the vectors u and v drawn with their heads at the same point, with u − v drawn from the tail of u to the tail of v.]

In the previous section, we learnt that by definition, the vector $\overrightarrow{pq}$ could be written as the difference of two points: $\overrightarrow{pq} = q - p$. Now, we’ll prove that $\overrightarrow{pq}$ can also be written as the difference of two vectors:

Fact 19. Let p and q be two points with position vectors $\mathbf{p}$ and $\mathbf{q}$. Then $\overrightarrow{pq} = \mathbf{q} - \mathbf{p}$.

Proof. $\mathbf{q} - \mathbf{p} = \mathbf{q} + (-\mathbf{p}) = \overrightarrow{oq} + (-\overrightarrow{op}) = \overrightarrow{oq} + \overrightarrow{po} = \overrightarrow{po} + \overrightarrow{oq}$. This is thus the vector that carries us first from p to o, then from o to q; in short, it carries us from p to q. So it is simply the vector $\overrightarrow{pq}$.

Example 246. (See figure on p. 262.) $\mathbf{b} - \mathbf{a} = (3, -1) - (-1, 2) = (4, -3) = \overrightarrow{ab}$.

Example 247. (See figure on p. 262.) $\mathbf{d} - \mathbf{c} = (3, -2) - (-1, 1) = (4, -3) = \overrightarrow{cd}$.

Interpreting u − v as the sum of the vectors u and −v is often convenient:

Example 248. (See figure on p. 262.) Without any numbers, we can compute: $\overrightarrow{ab} - \overrightarrow{cb} = \overrightarrow{ab} + (-\overrightarrow{cb}) = \overrightarrow{ab} + \overrightarrow{bc} = \overrightarrow{ac}$. We can verify with numbers that this is correct: $\overrightarrow{ab} - \overrightarrow{cb} = (4, -3) - (4, -2) = (0, -1) = \overrightarrow{ac}$. ✓

Example 249. (See figure on p. 262.) $\overrightarrow{ad} - \overrightarrow{cb} = (4, -4) - (4, -2) = (0, -2)$.

Exercise 105. Write down what $\overrightarrow{ac} + \overrightarrow{cb}$, $\overrightarrow{dc} + \overrightarrow{ca}$, $\overrightarrow{bd} + \overrightarrow{da}$, $\overrightarrow{ad} - \overrightarrow{cd}$, $-\overrightarrow{dc} - \overrightarrow{bd}$, and $\overrightarrow{bd} + \overrightarrow{db}$ are, without writing out any numbers. (Answer on p. 1002.)

Exercise 106. Using the figure on p. 262, compute each of the following: $\overrightarrow{ac} - \overrightarrow{cb}$, $\overrightarrow{dc} - \overrightarrow{ca}$, $\overrightarrow{bd} - \overrightarrow{da}$, $\overrightarrow{ad} + \overrightarrow{cd}$, $\overrightarrow{dc} + \overrightarrow{bd}$, and $\overrightarrow{bd} - \overrightarrow{db}$. (Answer on p. 1002.)

27.3

Displacement Vectors

Definition 72. If a moving particle starts at point a and ends at point b, we call $\overrightarrow{ab}$ its displacement vector.

Example 250. A particle is travelling along the red arc, along the path shown. Its starting point is in blue and its ending point is in purple. Its displacement vector is thus (2, 2).

[Figure: a particle’s curved path on the cartesian plane, with its starting point, its ending point, and the displacement vector (2, 2) marked.]

27.4

Length (or Magnitude) of a Vector

The Pythagorean Theorem says that if c is the length of the hypotenuse of a right-angled triangle and a, b are the lengths of the other two sides, then a² + b² = c².

As you learnt in secondary school, we can calculate the distance between two points using the Pythagorean Theorem:

Example 251. Let p = (1, 1) and q = (−1, −1) be two points. Then the distance between p and q is

$$\sqrt{[1 - (-1)]^2 + [1 - (-1)]^2} = \sqrt{4 + 4} = \sqrt{8}.$$

The vector v = (v₁, v₂) goes v₁ units right and v₂ units up. We are thus motivated to define its length (or magnitude) as:

Definition 73. The length (or magnitude) of a vector v = (v₁, v₂) is denoted ∣v∣ and defined by $|v| = \sqrt{v_1^2 + v_2^2}$.

Example 251 (continued). Another way to find the distance between p and q is to first find the vector that carries us from p to q. This is $\overrightarrow{pq} = (-2, -2)$. The distance between p and q is thus simply the length (or magnitude) of this vector: $|\overrightarrow{pq}| = \sqrt{(-2)^2 + (-2)^2} = \sqrt{8}$.

Of course, the distance from p to q is the same as the distance from q to p. So we could just as well have calculated the length of $\overrightarrow{qp} = (2, 2)$ and gotten the same answer: $|\overrightarrow{qp}| = \sqrt{2^2 + 2^2} = \sqrt{8}$.

Exercise 107. (Answer on p. 1002.) Using the figure on p. 262, compute each of the following: $|\overrightarrow{ac} - \overrightarrow{cb}|$, $|\overrightarrow{dc} - \overrightarrow{ca}|$, $|\overrightarrow{bd} - \overrightarrow{da}|$, $|\overrightarrow{ad} + \overrightarrow{cd}|$, $|\overrightarrow{dc} + \overrightarrow{bd}|$, and $|\overrightarrow{bd} - \overrightarrow{db}|$.

Also, find the distance between (18, 4) and (−1, −2).

Exercise 108. (Answer on p. 1002.) In general, given any two vectors u and v, is it true that ∣u + v∣ = ∣u∣ + ∣v∣?
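Definition 73 and Example 251 translate directly into a short computation. In this minimal sketch, vectors and points are represented as plain tuples, and the function names `length` and `distance` are assumptions chosen for illustration.

```python
import math


def length(v):
    """Length of the 2D vector v = (v1, v2), i.e. sqrt(v1^2 + v2^2)."""
    return math.hypot(v[0], v[1])


def distance(p, q):
    """Distance between the points p and q: the length of the vector from p to q."""
    return length((q[0] - p[0], q[1] - p[1]))


p, q = (1, 1), (-1, -1)
# Both directions give the same distance, which should match sqrt(8).
print(distance(p, q), distance(q, p), math.sqrt(8))
```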


27.5

Scalar Multiplication of a Vector

Definition 74. A scalar is simply any real number.

A scalar is often contrasted with a vector. A vector has both magnitude (or length) and direction. In contrast, a scalar has magnitude but no direction.

Definition 75. If v = (v₁, v₂) is a vector and c ∈ R is a scalar, then cv denotes the vector defined by cv = (cv₁, cv₂). We call this operation scalar multiplication of a vector.

Graphically, cv is simply the vector that has the same direction as v if c > 0 (and the opposite direction if c < 0), but with ∣c∣ times the length. The claim about length is formally shown in the next fact.

[Figure: a vector v and a scalar multiple cv pointing in the same direction.]

Fact 20. If v = (v₁, v₂) is a vector and c ∈ R, then ∣cv∣ = ∣c∣ ∣v∣.

Proof.

$$|cv| = |(cv_1, cv_2)| = \sqrt{(cv_1)^2 + (cv_2)^2} = \sqrt{c^2 v_1^2 + c^2 v_2^2} = |c| \sqrt{v_1^2 + v_2^2} = |c|\,|(v_1, v_2)| = |c|\,|v|.$$

Exercise 109. Using the figure on p. 262, write down $2\overrightarrow{ab}$, $3\overrightarrow{ac}$, and $4\overrightarrow{ad}$ in ordered set notation. Verify that $|2\overrightarrow{ab}| = 2|\overrightarrow{ab}|$, $|3\overrightarrow{ac}| = 3|\overrightarrow{ac}|$, and $|4\overrightarrow{ad}| = 4|\overrightarrow{ad}|$. (Answer on p. 1003.)

27.6

Unit Vectors

Definition 76. A unit vector is any vector of length 1.

Example 252. Let’s verify that the vectors (1, 0), (0, 1), and (√2/2, √2/2) are all unit vectors:

$$|(1, 0)| = \sqrt{1^2 + 0^2} = 1, \quad ✓$$

$$|(0, 1)| = \sqrt{0^2 + 1^2} = 1, \quad ✓$$

$$\left|\left(\tfrac{\sqrt{2}}{2}, \tfrac{\sqrt{2}}{2}\right)\right| = \sqrt{\left(\tfrac{\sqrt{2}}{2}\right)^2 + \left(\tfrac{\sqrt{2}}{2}\right)^2} = \sqrt{2/4 + 2/4} = 1. \quad ✓$$

Example 253. Let’s verify that the vectors (1, 1) and (−1, −1) are not unit vectors:

$$|(1, 1)| = \sqrt{1^2 + 1^2} = \sqrt{2} \neq 1, \quad ✓$$

$$|(-1, -1)| = \sqrt{(-1)^2 + (-1)^2} = \sqrt{2} \neq 1. \quad ✓$$

We specially reserve the name $\mathbf{i}$ (or $\vec{i}$) for the unit vector (1, 0), which is the unit vector that is purely in the direction of the x-axis. Similarly, we specially reserve the name $\mathbf{j}$ (or $\vec{j}$) for the unit vector (0, 1), which is the unit vector that is purely in the direction of the y-axis.

And so, using also what we learnt about the sum of and scalar multiplication of vectors, we can rewrite any vector into the sum of i’s and j’s:

Example 254. The position vectors for the points a, b, and c (illustrated below) are a = (1, 2) = i + 2j, b = (4, −3) = 4i − 3j, and c = (0, 6) = 6j.

[Figure: the position vectors a = i + 2j, b = 4i − 3j, and c = 6j drawn on the cartesian plane, each built up from copies of the unit vectors i and j.]

Informally, the unit vector in the direction v (denoted $\hat{v}$) points in the same direction as v, but has length 1. Formally:

Definition 77. The unit vector in the direction v (where v is a non-zero vector) is defined by

$$\hat{v} = \frac{1}{|v|} v.$$

Exercise 110. In the figure on p. 262, what are the unit vectors in the directions $\overrightarrow{ab}$, $\overrightarrow{ac}$, and $\overrightarrow{ad}$? What are the unit vectors in the directions $2\overrightarrow{ab}$, $3\overrightarrow{ac}$, and $4\overrightarrow{ad}$? (Answer on p. 1003.)

The following fact is an obvious corollary to Fact 20.

Fact 21. If c is a scalar and $\hat{v}$ is a unit vector, then the vector $c\hat{v}$ has length ∣c∣.

Informally, two vectors have the same unit vector ⇐⇒ they both point in the same direction. Formally:

Fact 22. Let a and b be any two non-zero vectors. Then $\hat{a} = \hat{b}$ ⇐⇒ a can be written as a positive scalar multiple of b.

Proof. Optional, see p. 856 in the Appendices.

Informally, any vector in the plane can be written as the linear combination of any other two vectors, provided that these two are not parallel. Formally:

Fact 23. Let a and b be any two non-zero vectors in the same plane that are not parallel (i.e. $\hat{a} \neq \pm\hat{b}$). Then every vector in the same plane can be written as αa + βb for some α, β ∈ R.

Proof. Optional, see p. 856 in the Appendices.

See TYS Exercise 339 (i) for an application of the above fact.

Exercise 111. Given the vectors a = (1, 3) and b = (7, 5), show that each of the following vectors can be written in the form αa + βb for some α, β ∈ R. (i) (0, 1). (ii) (1, 0). (iii) (1, 1). (Answer on p. 1003.)
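Unit vectors and linear combinations can both be computed mechanically. The sketch below represents vectors as tuples; the helper names (`scale`, `length`, `unit`, `combination_coefficients`) and the sample vectors are assumptions for illustration, and the coefficients are found with Cramer's rule, which is one of several possible methods.

```python
import math


def scale(c, v):
    """Scalar multiplication: c * (v1, v2) = (c*v1, c*v2)."""
    return (c * v[0], c * v[1])


def length(v):
    return math.hypot(v[0], v[1])


def unit(v):
    """Unit vector in the direction of the non-zero vector v."""
    n = length(v)
    if n == 0:
        raise ValueError("the zero vector has no direction")
    return scale(1 / n, v)


def combination_coefficients(a, b, target):
    """Find (alpha, beta) with alpha*a + beta*b = target, using Cramer's rule."""
    det = a[0] * b[1] - b[0] * a[1]
    if det == 0:
        raise ValueError("a and b are parallel (or one of them is zero)")
    alpha = (target[0] * b[1] - b[0] * target[1]) / det
    beta = (a[0] * target[1] - target[0] * a[1]) / det
    return alpha, beta


v = (3, 4)
print(unit(v), length(unit(v)))           # a vector of length 1 (up to rounding)
print(length(scale(-2, v)), 2 * length(v))  # |cv| equals |c| |v|
print(combination_coefficients((1, 2), (3, 1), (5, 5)))
```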

27.7

The Ratio Theorem

Theorem 3. (Ratio Theorem.) Let a, b, and p be points, where p is on the line segment ab. Let $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{p}$ be the corresponding position vectors. Then

$$\mathbf{p} = \frac{|\overrightarrow{bp}|}{|\overrightarrow{ap}| + |\overrightarrow{bp}|} \mathbf{a} + \frac{|\overrightarrow{ap}|}{|\overrightarrow{ap}| + |\overrightarrow{bp}|} \mathbf{b}.$$

Proof. Optional, see p. 858 (Appendices).

Or if we let $\lambda = |\overrightarrow{ap}|$ and $\mu = |\overrightarrow{bp}|$, then the above can be rewritten in a form that is perhaps easier to remember:

$$\mathbf{p} = \frac{\mu \mathbf{a}}{\lambda + \mu} + \frac{\lambda \mathbf{b}}{\lambda + \mu} = \frac{\mu \mathbf{a} + \lambda \mathbf{b}}{\lambda + \mu}.$$

By the way, the List of Formulae (p. 4) contains this statement: “The point dividing AB in the ratio λ ∶ µ has position vector $\frac{\mu \mathbf{a} + \lambda \mathbf{b}}{\lambda + \mu}$.”

Example 255. Consider the points a = (3, 4) and b = (−1, 2). Find the point p that divides the line segment ab into the ratio 3 ∶ 2.

We have

$$\mathbf{p} = \frac{2}{5}\mathbf{a} + \frac{3}{5}\mathbf{b} = \frac{2}{5}(3, 4) + \frac{3}{5}(-1, 2) = \left(\frac{3}{5}, \frac{14}{5}\right).$$

Hence, the point is p = (3/5, 14/5).

Example 256. Consider the points a = (8, 3) and b = (2, −6). Find the point p that divides the line segment ab into the ratio 3 ∶ 7.

We have

$$\mathbf{p} = 0.7\mathbf{a} + 0.3\mathbf{b} = 0.7(8, 3) + 0.3(2, -6) = (6.2, 0.3).$$

Hence, the point is p = (6.2, 0.3).

Exercise 112. (a) Consider the points a = (1, 2) and b = (3, 4). Find the point p that divides the line segment ab into the ratio 5 ∶ 6. (b) Consider the points a = (1, 4) and b = (2, 3). Find the point p that divides the line segment ab into the ratio 5 ∶ 1. (c) Consider the points a = (−1, 2) and b = (3, −4). Find the point p that divides the line segment ab into the ratio 2 ∶ 3. (Answer on p. 1003.)
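The Ratio Theorem is easy to apply mechanically, and a common slip is to swap the two weights. The following sketch (the function name `divide_segment` is an assumption) implements the formula (µa + λb)/(λ + µ) for a point dividing the segment ab in the ratio λ ∶ µ and applies it to the data of Examples 255 and 256.

```python
def divide_segment(a, b, lam, mu):
    """Point dividing the segment from a to b in the ratio lam : mu.

    Implements (mu*a + lam*b) / (lam + mu): the weight on a is mu, not lam.
    """
    total = lam + mu
    return tuple((mu * ai + lam * bi) / total for ai, bi in zip(a, b))


# Data from Example 255 (ratio 3 : 2) and Example 256 (ratio 3 : 7).
print(divide_segment((3, 4), (-1, 2), 3, 2))
print(divide_segment((8, 3), (2, -6), 3, 7))
```

Note that the point closer to a gets the larger weight on a: with ratio 3 ∶ 7, the weight on a is 0.7.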


28

Scalar Product

Definition 78. Given two 2D vectors u = (u₁, u₂) and v = (v₁, v₂), their scalar product (or dot product), denoted u ⋅ v, is defined by u ⋅ v = u₁v₁ + u₂v₂.

And so to get the scalar product, simply multiply each term of each vector with the corresponding term of the other, then add these up. It’s that simple! The scalar product is itself simply a scalar (i.e. a real number). Hence the name.

Example 257. (5, −3) ⋅ (2, 1) = 5 × 2 + (−3) × 1 = 7.

Example 258. (0, 17) ⋅ (−1, 3) = 0 × (−1) + 17 × 3 = 51.

Ordinary multiplication is distributive:

Example 259. 3 × (5 + 11) = 3 × 5 + 3 × 11 and 18 × (7 − 31) = 18 × 7 − 18 × 31.

It turns out that the scalar product is likewise distributive:

Fact 24. Let a, b, and c be vectors. Then a ⋅ (b + c) = a ⋅ b + a ⋅ c and (a + b) ⋅ c = a ⋅ c + b ⋅ c.

Proof. Optional, see p. 857 in the Appendices.

Here is one use of the scalar product: the length of a vector is simply the square root of its scalar product with itself. Formally:

Fact 25. Given a vector v, $|v| = \sqrt{v \cdot v}$.

Proof. By Definition 73 (length of vector), $|v| = \sqrt{v_1^2 + v_2^2}$. By Definition 78 (scalar product), $v \cdot v = v_1 v_1 + v_2 v_2 = v_1^2 + v_2^2$. Hence, $|v| = \sqrt{v \cdot v}$.
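The scalar product and the two facts above can be checked with a few lines of code. This is a minimal sketch with vectors as tuples; the function names `dot` and `add` and the sample vectors used for the distributivity check are assumptions.

```python
import math


def dot(u, v):
    """Scalar (dot) product of two 2D vectors: u1*v1 + u2*v2."""
    return u[0] * v[0] + u[1] * v[1]


def add(u, v):
    return (u[0] + v[0], u[1] + v[1])


print(dot((5, -3), (2, 1)))    # Example 257: 7
print(dot((0, 17), (-1, 3)))   # Example 258: 51

# Fact 24 (distributivity), checked on sample integer vectors.
a, b, c = (1, 2), (3, -4), (-5, 6)
print(dot(a, add(b, c)) == dot(a, b) + dot(a, c))

# Fact 25: the length of v is the square root of v . v.
v = (4, -3)
print(math.sqrt(dot(v, v)))
```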

Next up is a more important use of the scalar product:

28.1

The Angle between Two Vectors

Fact 26. Let θ ∈ [0, π] be the angle between two non-zero vectors u and v. Then u ⋅ v = ∣u∣ ∣v∣ cos θ.

[Figure: two vectors u and v drawn from a common tail, with the angle θ between them.]

Proof. Optional, see p. 857 (Appendices).

The above fact (see footnote 37) gives us a very convenient way to calculate the angle between two vectors, because rearranging, we have:

$$\theta = \cos^{-1}\left(\frac{u \cdot v}{|u|\,|v|}\right).$$

Footnote 37. We have two possible interpretations of the scalar product that are entirely equivalent. We can use either of these interpretations as our definition and then prove that the other interpretation is true. (1) In this textbook, we first define the scalar product by u ⋅ v = u₁v₁ + u₂v₂, then prove that u ⋅ v = ∣u∣ ∣v∣ cos θ. That is, we start with the algebraic definition, then prove a geometric property. (2) In contrast, others may prefer to first define the scalar product by u ⋅ v = ∣u∣ ∣v∣ cos θ, then prove that u ⋅ v = u₁v₁ + u₂v₂. That is, they start with the geometric definition, then prove an algebraic property. Either way, we first define the scalar product one way or the other. We then prove that the alternative statement is equivalent. (It is possible that your JC teachers take the second approach, rather than the first, as is done in this textbook. Or worse, your teachers simply leave you confused as to why the hell u ⋅ v = ∣u∣ ∣v∣ cos θ and at the same time, magically enough, u ⋅ v = u₁v₁ + u₂v₂. This was my experience as a JC student a number of years ago. If this is also your current experience, hopefully this textbook has helped to clear things up!)

Example 260. The vector i = (1, 0) points east. The vector (1, 1) points northeast. We know the angle between these two vectors is π/4. Let’s check and verify that the formula works:

$$\theta = \cos^{-1}\left(\frac{i \cdot (1, 1)}{|i|\,|(1, 1)|}\right) = \cos^{-1}\left(\frac{(1, 0) \cdot (1, 1)}{|(1, 0)|\,|(1, 1)|}\right) = \cos^{-1}\left(\frac{1 \times 1 + 0 \times 1}{\sqrt{1^2 + 0^2} \times \sqrt{1^2 + 1^2}}\right)$$

$$= \cos^{-1}\left(\frac{1 + 0}{1 \times \sqrt{2}}\right) = \cos^{-1}\left(\frac{1}{\sqrt{2}}\right) = \frac{\pi}{4}. \quad ✓$$

Example 261. The vector i = (1, 0) points east. The vector j = (0, 1) points north. We know the angle between these two vectors is right (i.e. π/2). Let’s check and verify that the formula works:

$$\theta = \cos^{-1}\left(\frac{i \cdot j}{|i|\,|j|}\right) = \cos^{-1}\left(\frac{(1, 0) \cdot (0, 1)}{|(1, 0)|\,|(0, 1)|}\right) = \cos^{-1}\left(\frac{1 \times 0 + 0 \times 1}{\sqrt{1^2 + 0^2} \times \sqrt{0^2 + 1^2}}\right)$$

$$= \cos^{-1}\left(\frac{0 + 0}{1 \times 1}\right) = \cos^{-1} 0 = \frac{\pi}{2}. \quad ✓$$

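Example 262. The angle between the vectors (3, 2) and (−1, −4) is

$$\theta = \cos^{-1}\left(\frac{(3,2) \cdot (-1,-4)}{|(3,2)|\,|(-1,-4)|}\right) = \cos^{-1}\left(\frac{3 \times (-1) + 2 \times (-4)}{\sqrt{3^2+2^2} \times \sqrt{(-1)^2+(-4)^2}}\right) = \cos^{-1}\left(\frac{-3-8}{\sqrt{13} \times \sqrt{17}}\right) = \cos^{-1}\left(\frac{-11}{\sqrt{221}}\right) \approx 2.404.$$

This is an example where the angle is obtuse, i.e. between π/2 and π.

[Figure: the vectors (3, 2) and (−1, −4) drawn from the origin on the xy-plane, with the angle 2.404 rad between them.]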
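The three examples above can also be checked numerically. The following is a minimal Python sketch, not part of the textbook; the helper names `dot`, `length`, and `angle_between` are made up for illustration, and the vectors are assumed to be non-zero.

```python
import math


def dot(u, v):
    """Scalar product of two vectors of the same dimension."""
    return sum(ui * vi for ui, vi in zip(u, v))


def length(u):
    """Length (magnitude) of a vector."""
    return math.sqrt(dot(u, u))


def angle_between(u, v):
    """Angle in radians, in [0, pi], between two non-zero vectors."""
    c = dot(u, v) / (length(u) * length(v))
    c = max(-1.0, min(1.0, c))  # guard against rounding just outside [-1, 1]
    return math.acos(c)


print(angle_between((1, 0), (1, 1)))    # Example 260: should be close to pi/4
print(angle_between((1, 0), (0, 1)))    # Example 261: should be close to pi/2
print(angle_between((3, 2), (-1, -4)))  # Example 262: should be close to 2.404
```

Because `dot` and `length` work on tuples of any length, the same sketch also applies to 3D vectors.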


Recall that the arccosine function is defined to have range [0, π]. That is, cos⁻¹ x ∈ [0, π]. Moreover,
• x > 0 ⟹ cos⁻¹ x ∈ [0, π/2), i.e. cos⁻¹ x is an acute (or zero) angle.
• x = 0 ⟹ cos⁻¹ x = π/2, i.e. cos⁻¹ x is a right angle.
• x < 0 ⟹ cos⁻¹ x ∈ (π/2, π], i.e. cos⁻¹ x is an obtuse (or straight) angle.

These three observations, together with Fact 26, imply the following Fact, which by the way was already illustrated by the previous three examples:

Fact 27. Let u and v be non-zero vectors. The angle between u and v is
(i) acute (or zero) if u ⋅ v > 0;
(ii) right if u ⋅ v = 0; and
(iii) obtuse (or straight) if u ⋅ v < 0.

We’ll use the words perpendicular, orthogonal, and normal interchangeably:

Definition 79. Two vectors are orthogonal (or perpendicular or normal) if the angle between them is right (i.e. equal to π/2).

I will sometimes write u ⊥ v to mean u is orthogonal (or perpendicular or normal) to v.

Exercise 113. First write down the angle between each of the following pairs of vectors without using the above formula. Then verify that the formula does indeed give you these correct angles: (a) (2, 0) and (0, 17); (b) (5, 0) and (−3, 0); (c) i and (1, √3/3); (d) i and (1, √3). (Answers on pp. 1004 and 1005.)

Exercise 114. Verify that i and j are orthogonal, by computing their scalar product. (Answer on p. 1006.)


28.2 Projection of One Vector on Another

The scalar product also gives a convenient way of computing the length of the projection of one vector on another. Say we have a right triangle (left diagram) where the angle θ and the length a of the hypotenuse are known. What is the length b of the side adjacent to θ? It is simply a cos θ.

[Figure: left, a right triangle with hypotenuse a, angle θ, and adjacent side b; right, the vectors a and b with angle θ between them, and the projection of a on b.]

Now suppose a (blue) and b (green) are vectors (right diagram). The projection of the vector a on the vector b is denoted a⊥b (red). Note that a⊥b is itself a vector. What is the length of the projection? Well, if ∣a∣ is the length of the vector a and θ is the (acute or right) angle between the two vectors, then the length of the projection ∣a⊥b∣ is simply ∣a∣ cos θ.

Nicely enough, we actually have a quick alternative method of computing this length. Let b̂ be the unit vector for b. Then

$$a \cdot \hat{b} = |a|\,|\hat{b}| \cos\theta = |a| \times 1 \times \cos\theta = |a| \cos\theta = |a_{\perp b}|.$$

So we have a nice interpretation for a ⋅ b̂, or more correctly ∣a ⋅ b̂∣, since a ⋅ b̂ is negative whenever θ is obtuse:

∣a ⋅ b̂∣ is simply the length of the projection of a on b!


Example 263. The length of the projection of (3, 2) on (1, 1) is

$$(3,2) \cdot \widehat{(1,1)} = (3,2) \cdot \left[\frac{1}{|(1,1)|}(1,1)\right] = \frac{1}{\sqrt{2}}(3,2) \cdot (1,1) = \frac{1}{\sqrt{2}}(3 \times 1 + 2 \times 1) = \frac{5}{\sqrt{2}} = \frac{5\sqrt{2}}{2}.$$

You should verify for yourself that the length of the projection of (3, 2) on (1000, 1000) is also 5√2/2. The length of the vector to be projected, (3, 2), matters, but the length of the vector onto which it is projected, be it (1, 1) or (1000, 1000), doesn’t matter.

Example 264. Consider the projection of (−6, 1) on (2, 0). We have

$$(-6,1) \cdot \widehat{(2,0)} = (-6,1) \cdot \left[\frac{1}{|(2,0)|}(2,0)\right] = \frac{1}{2}(-6,1) \cdot (2,0) = \frac{1}{2}(-6 \times 2 + 1 \times 0) = \frac{-12}{2} = -6.$$

This scalar product is negative, so the length of the projection is its absolute value, ∣−6∣ = 6.

Again, you can verify for yourself that the length of the projection of (−6, 1) on (50000, 0) is also 6. Again, the length of the vector to be projected, (−6, 1), matters, but the length of the vector onto which it is projected, be it (2, 0) or (50000, 0), doesn’t matter.

Exercise 115. What are the lengths of the projections of (a) (1, 0) on (33, 33) and (b) (33, 33) on (1, 0)? (Answer on p. 1006.)
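As a quick numerical check of the last two examples, here is a short Python sketch (an illustration only; the function name `projection_length` is made up). It computes ∣a ⋅ b̂∣, so a negative scalar product still yields a non-negative length, and it assumes b is non-zero.

```python
import math


def dot(u, v):
    """Scalar product of two vectors of the same dimension."""
    return sum(ui * vi for ui, vi in zip(u, v))


def projection_length(a, b):
    """Length of the projection of a on the non-zero vector b, i.e. |a . b_hat|."""
    b_length = math.sqrt(dot(b, b))
    return abs(dot(a, b)) / b_length


print(projection_length((3, 2), (1, 1)))        # Example 263
print(projection_length((3, 2), (1000, 1000)))  # same direction, longer b
print(projection_length((-6, 1), (2, 0)))       # Example 264
print(projection_length((-6, 1), (50000, 0)))   # same direction, longer b
```

Scaling b leaves the result unchanged, which is the point made in the examples: only the direction of b matters.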


28.3 Direction Cosines

The angle between a vector v and the x-axis is simply the angle between v and i = (1, 0). Similarly, the angle between v and the y-axis is simply the angle between v and j = (0, 1).

Example 265. Consider the angle a between the vector (3, 2) and the x-axis. We have:

$$\alpha = \cos a = \frac{(3,2) \cdot (1,0)}{|(3,2)|\,|(1,0)|} = \frac{3 \times 1 + 2 \times 0}{\sqrt{3^2+2^2} \times \sqrt{1^2+0^2}} = \frac{3}{\sqrt{13}}.$$

We refer to α = 3/√13 as the x-direction cosine of the vector (3, 2). By computing cos⁻¹ α = cos⁻¹(3/√13) ≈ 0.588, we find that the angle a between the vector (3, 2) and the x-axis is 0.588 rad.

[Figure: the vectors (3, 2) and (−1, −4) on the xy-plane, with angles labelled 0.983 rad and 2.404 rad.]


Example 266. Consider the angle b between the vector (3, 2) and the y-axis. We have:

$$\beta = \cos b = \frac{(3,2) \cdot (0,1)}{|(3,2)|\,|(0,1)|} = \frac{3 \times 0 + 2 \times 1}{\sqrt{3^2+2^2} \times \sqrt{0^2+1^2}} = \frac{2}{\sqrt{13} \times 1} = \frac{2}{\sqrt{13}}.$$

We refer to β = 2/√13 as the y-direction cosine of the vector (3, 2). By computing cos⁻¹ β = cos⁻¹(2/√13) ≈ 0.983, we find that the angle b between the vector (3, 2) and the y-axis is 0.983 rad.

Definition 80. Given a non-zero vector v, its x-direction cosine α is the cosine of the angle between v and the x-axis, i.e. α = v̂ ⋅ i. Its absolute value ∣α∣ is simply the length of the projection of v̂ on the x-axis.

Similarly, its y-direction cosine β is the cosine of the angle between v and the y-axis, i.e. β = v̂ ⋅ j. Its absolute value ∣β∣ is simply the length of the projection of v̂ on the y-axis.

The next Fact is immediate from the above definition:

Fact 28. Let v be a vector and α and β be its x- and y-direction cosines. Then v̂ = (α, β).

Example 267. The x- and y-direction cosines of the vector (3, 2) are

$$\alpha = \frac{3}{\sqrt{13}} \quad\text{and}\quad \beta = \frac{2}{\sqrt{13}}.$$

Hence, the unit vector in the direction (3, 2) is (3/√13, 2/√13).
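Fact 28 says that the direction cosines are just the components of the unit vector. A minimal Python sketch of this (the function name `direction_cosines` is hypothetical, and the vector is assumed to be non-zero):

```python
import math


def direction_cosines(v):
    """Direction cosines of a non-zero vector, i.e. the components of its unit vector."""
    v_length = math.sqrt(sum(c * c for c in v))
    return tuple(c / v_length for c in v)


alpha, beta = direction_cosines((3, 2))
print(alpha, beta)                        # 3/sqrt(13) and 2/sqrt(13), as in Example 267
print(math.acos(alpha), math.acos(beta))  # angles with the x- and y-axes, in radians
print(alpha ** 2 + beta ** 2)             # a unit vector has length 1, up to rounding
```

The same function works unchanged for 3D vectors, where it returns the x-, y-, and z-direction cosines.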

Exercise 116. For each of the following vectors, find their x- and y-direction cosines. Hence write down their unit vectors. (a) (1, 3). (b) (4, 2). (c) (−1, 2). (Answer on p. 1006.)


29 Vectors in 3D

In two dimensions, we had the cartesian (or two-dimensional) plane with x- and y-axes. Informally, the x-axis goes to the right and the y-axis goes up. A point was any ordered pair of real numbers. The origin o = (0, 0) was the intersection point of the two axes. And relative to the origin, the generic point a = (a1 , a2 ) was the point a1 units to the right and a2 units up.

In three dimensions, we now instead have the three-dimensional space (3D space). The x- and y-axes are as before. There is an additional z-axis that, informally, comes “out of the paper, perpendicular to the plane of the paper, straight towards your face”.

We call this the right hand coordinate system, because if you take your right hand, stick out your thumb, forefinger, and middle finger so that they are perpendicular, your thumb represents the x-axis, your forefinger the y-axis, and your middle finger the z-axis. (Try it!) (If instead the z-axis goes “into the paper”, then we’d have a left hand coordinate system. Can you explain why?)

[Figure: 3D coordinate axes x, y, and z, showing the point a with its coordinates a1, a2, and a3 marked on the three axes.]

In the context of 3D space, a point is any ordered triple of real numbers. The origin o = (0, 0, 0) is the point where the x-, y-, and z-axes intersect. And relative to the origin, the generic point a = (a1, a2, a3) is the point a1 units to the right, a2 units up, and a3 units “out of the paper”.

Everything we learnt about 2D vectors finds its analogy in three-dimensional (3D) vectors. Most of the time, the analogy is obvious. Try these exercises.

Exercise 117. (Answer on p. 1007.)

(a) Fill in the blanks. A 3D vector is an “arrow” that has two characteristics: __________ and __________. Just like a point, it can be described by an __________ of __________. The vector a = (a1, a2, a3) carries us from the origin to _______________.

(b) What other ways are there to denote the vector a = (a1, a2, a3)? (Hint. The unit vector in the z-axis is now called k.)

(c) Let a = (a1, a2, a3) and b = (b1, b2, b3) be points. What are (i) a + b; (ii) $a + \overrightarrow{ob}$; (iii) $\overrightarrow{oa} + \overrightarrow{ob}$; and (iv) $\overrightarrow{oa} - \overrightarrow{ba}$?


The length (or magnitude) of a 2D vector v = (v1, v2) was defined by $\sqrt{v_1^2 + v_2^2}$. What then is the length (or magnitude) of a 3D vector? This is the one instance where the analogy from the 2D case to the 3D case is perhaps less than obvious. So let’s explore this issue.

Consider the blue point a in the figure below. What is its distance from the origin (0, 0, 0)? In other words, what is the length of the green dotted line?

[Figure: 3D axes x, y, and z with the blue point a and its coordinates a1, a2, a3; a green dotted line joins the origin to a, a red dotted line joins the origin to a red point, and a blue dotted line of length a1 joins the red point to a.]

First let’s calculate the distance of the red point from the origin, in other words the length of the red dotted line. By the Pythagorean Theorem, it is $\sqrt{a_2^2 + a_3^2}$.

Now, notice the green dotted line, the red dotted line (length $\sqrt{a_2^2 + a_3^2}$), and the blue dotted line (length a1) form a right-angled triangle, with the hypotenuse being the green dotted line. Thus, the length of the green dotted line is (again by the Pythagorean Theorem):

$$\sqrt{a_1^2 + \left(\sqrt{a_2^2 + a_3^2}\right)^2} = \sqrt{a_1^2 + a_2^2 + a_3^2}.$$


We are thus motivated to define the length (or magnitude) of a 3D vector as follows:

Definition 81. The length (or magnitude) of a vector a = (a1, a2, a3) is denoted ∣a∣ and defined by $|a| = \sqrt{a_1^2 + a_2^2 + a_3^2}$.

This is very much analogous to the definition of the length (or magnitude) of a 2D vector.
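To see the two-step Pythagorean argument in action, here is a small Python sketch (an illustration only; both function names are made up). It computes the length once directly from the definition and once by applying the 2D Pythagorean Theorem twice, using the vector (2, 3, 6) as an arbitrary test case.

```python
import math


def length_3d(a):
    """Length of a 3D vector, straight from the definition."""
    a1, a2, a3 = a
    return math.sqrt(a1 ** 2 + a2 ** 2 + a3 ** 2)


def length_3d_two_step(a):
    """Same length, obtained by applying the 2D Pythagorean Theorem twice."""
    a1, a2, a3 = a
    red = math.hypot(a2, a3)    # first right triangle: sides a2 and a3
    return math.hypot(a1, red)  # second right triangle: sides a1 and the first hypotenuse


print(length_3d((2, 3, 6)))
print(length_3d_two_step((2, 3, 6)))
```

Both calls should agree (up to floating-point rounding), since 2² + 3² + 6² = 49.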

Let’s continue with our exercises for 3D vectors: Exercise 118. (Answer on p. 1008.) (a) Compute the lengths of the vectors a = (1, 2, 3), b = (4, 5, 6), and a − b.

(b) Compute the lengths of the vectors 2a = (2, 4, 6), 3b = (12, 15, 18), and 4(a − b).

(c) Compute the unit vectors in the directions a = (1, 2, 3), b = (4, 5, 6), and a − b.

(d) Compute (1, 2, 3) ⋅ (4, 5, 6) and (−2, 4, −6) ⋅ (1, −2, 3).

(e) Compute the angles (i) between the vectors a = (1, 2, 3) and b = (4, 5, 6); and (ii) between the vectors u = (−2, 4, −6) and v = (1, −2, 3). (iii) Are the vectors (−2, 4, −6) and (1, −2, 3) orthogonal?

(f) Compute the length of the projection of a = (1, 2, 3) on b = (4, 5, 6).

(g) Find the point that divides the line segment ab in the ratio 2 ∶ 3.

(h) For each of the following vectors, find their x-, y-, and z-direction cosines. And then write down their unit vectors. (i) (1, 3, −2). (ii) (4, 2, −3). (iii) (−1, 2, −4).


30 Vector Product

30.1 Vector Product in 2D

Recall that given two 2D vectors u = (ux, uy) and v = (vx, vy), their scalar product was the scalar defined by u ⋅ v = ux vx + uy vy. We now define a very similar concept.

Definition 82. Given two 2D vectors u = (ux, uy) and v = (vx, vy), their vector product (or cross product), denoted u × v, is the scalar defined by u × v = ux vy − uy vx.

Example 268. If u = (1, 2) and v = (3, 4), then u × v = 1 × 4 − 2 × 3 = −2.

Example 269. If p = (−3, 5) and q = (6, 1), then p × q = −3 × 1 − 5 × 6 = −33.

Example 270. If u = (−1, 4) and v = (2, −3), then u × v = (−1) × (−3) − 4 × 2 = −5.

Ordinary multiplication is commutative. This simply means that given any real numbers a, b, we have a × b = b × a. For example,

Example 271. 4 × 7 = 7 × 4 and 3 × 5 = 5 × 3.

In contrast, the vector product is not commutative because u × v ≠ v × u. This might be the first time in your life that you’re encountering a product that isn’t commutative. In fact, the vector product is anticommutative because u × v = −v × u! For example,

Example 272. If u = (1, 2) and v = (3, 4), then u × v = 1 × 4 − 2 × 3 = −2, but v × u = 2 × 3 − 1 × 4 = 2.

Example 273. If u = (−1, 4) and v = (2, −3), then u × v = (−1) × (−3) − 4 × 2 = −5, but v × u = 4 × 2 − (−1) × (−3) = 5.

Recall that if θ ∈ [0, π] is the angle between two vectors, then based on our definition that u ⋅ v = ux vx + uy vy, we could prove that u ⋅ v = ∣u∣ ∣v∣ cos θ. It turns out that, based on our definition that u × v = ux vy − uy vx, we can prove a very similar result:³⁸

Fact 29. Let u and v be two non-zero 2D vectors and θ ∈ [0, π] be the angle between them. Then the scalar u × v is equal to either ∣u∣ ∣v∣ sin θ or − ∣u∣ ∣v∣ sin θ. Proof. Optional, see p. 859 in Appendices.
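The definition, the anticommutativity examples, and Fact 29 can all be checked numerically. The Python sketch below is only an illustration; the names `cross_2d` and `angle_between` are made up, and the vectors are assumed to be non-zero.

```python
import math


def cross_2d(u, v):
    """2D vector product (a scalar): ux*vy - uy*vx."""
    return u[0] * v[1] - u[1] * v[0]


def angle_between(u, v):
    """Angle in [0, pi] between two non-zero 2D vectors, via the scalar product."""
    c = (u[0] * v[0] + u[1] * v[1]) / (math.hypot(*u) * math.hypot(*v))
    return math.acos(max(-1.0, min(1.0, c)))


for u, v in [((1, 2), (3, 4)), ((-3, 5), (6, 1)), ((-1, 4), (2, -3))]:
    theta = angle_between(u, v)
    geometric = math.hypot(*u) * math.hypot(*v) * math.sin(theta)
    # u x v, then v x u (the negative of the first), then |u||v| sin(theta)
    print(cross_2d(u, v), cross_2d(v, u), geometric)
```

In each row, the first two numbers should be negatives of each other, and the third should match their common absolute value up to rounding.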

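Earlier we already had one formula for calculating the angle between two vectors. Let θ ∈ [0, π] be the angle between u and v. Then

$$\theta = \cos^{-1}\left(\frac{u \cdot v}{|u|\,|v|}\right).$$

The above Fact now gives us a second formula. Let θ ∈ [0, π/2] be the acute or right angle between u and v. Then

$$\theta = \sin^{-1}\left|\frac{u \times v}{|u|\,|v|}\right|.$$

(If the angle between u and v is obtuse, this second formula returns its supplement instead.)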

However, we’ll stick with using only the first cosine formula. We won’t use the second sine formula, mainly because, as we’ll see, computing the vector product is very tedious, especially in the 3D case, where it is a different creature altogether.

38

Footnote 37 explained that the scalar product could be defined in one of two equivalent ways. Similarly, the vector product can be defined in one of two equivalent ways. We can use either definition and then prove that the other is true.
(1) In this textbook, we first define the vector product by u × v = ux vy − uy vx; we then prove that u × v = ± ∣u∣ ∣v∣ sin θ, where θ is the angle between the two vectors. That is, we start with the algebraic definition, then prove a geometric property.
(2) The alternative approach is this: define the vector product by u × v = ∣u∣ ∣v∣ sin θ if the rotation from u to v through the angle θ is anticlockwise, or u × v = −∣u∣ ∣v∣ sin θ if it is clockwise; then prove that u × v = ux vy − uy vx. That is, we start with the geometric definition, then prove an algebraic property.


30.2 Areas of Triangles and Parallelograms

SYLLABUS ALERT

Calculation of the area of a triangle or parallelogram is included in the 9740 (old) syllabus, but not in the 9758 (revised) syllabus. So you can skip this section if you’re taking 9758.

The vector product is also helpful for computing the area of triangles and parallelograms.

Fact 30. The triangle with sides of lengths ∣u∣, ∣v∣, and ∣v − u∣ has area 0.5∣u × v∣.

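[Figure: two triangles with sides u, v, and v − u, and angle θ between u and v. Case #1: θ is acute and the height is ∣v∣ sin θ. Case #2: θ is obtuse and the height is ∣v∣ sin(π − θ).]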

Proof. Case #1. If the vectors u and v form an acute or right angle θ, then the area of the triangle is simply 0.5 × Base × Height or 0.5 ∣u∣ ∣v∣ sin θ. And by Fact 29, 0.5 ∣u∣ ∣v∣ sin θ = 0.5∣u × v∣.

Case #2. And if the vectors u and v form an obtuse angle θ, then the area of the triangle is again simply 0.5 × Base × Height or 0.5 ∣u∣ ∣v∣ sin(π − θ). Recall that sin(π − θ) = sin π cos θ − sin θ cos π = sin θ. So again the area of the triangle is 0.5 ∣u∣ ∣v∣ sin θ or 0.5∣u × v∣. Example 274. Consider the triangle formed by the points (0, 0), (3, 4), and (5, 6). Its area is simply 0.5 ∣(3, 4) × (5, 6)∣ = 0.5∣3 × 6 − 4 × 5∣ = 1.


Fact 31. The parallelogram with sides of lengths ∣u∣ and ∣v∣, and diagonal of length ∣v − u∣ has area ∣u × v∣.

[Figure: a parallelogram with sides u and v, angle θ between them, and diagonal v − u.]

Proof. Such a parallelogram is simply composed of two of the triangles from Fact 30. And so its area is simply twice the area of the triangle, or 2 × 0.5∣u × v∣ = ∣u × v∣.
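Facts 30 and 31 translate directly into code. The following Python sketch is an illustration under the assumption that a triangle is given by its three vertices; the function names are made up. It takes u and v to be the two sides leaving the first vertex.

```python
def cross_2d(u, v):
    """2D vector product (a scalar): ux*vy - uy*vx."""
    return u[0] * v[1] - u[1] * v[0]


def triangle_area(p, q, r):
    """Area of the triangle with vertices p, q, r (Fact 30)."""
    u = (q[0] - p[0], q[1] - p[1])  # side from p to q
    v = (r[0] - p[0], r[1] - p[1])  # side from p to r
    return 0.5 * abs(cross_2d(u, v))


def parallelogram_area(u, v):
    """Area of the parallelogram with sides u and v (Fact 31)."""
    return abs(cross_2d(u, v))


print(triangle_area((0, 0), (3, 4), (5, 6)))  # the triangle of Example 274
print(parallelogram_area((3, 4), (5, 6)))     # twice the triangle's area
```

Since one vertex in Example 274 is the origin, the sides u and v are simply (3, 4) and (5, 6).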


30.3 Vector Product in 3D

The 3D vector product is very different from the 2D vector product. The latter was simply a scalar (real number); in contrast, the 3D vector product is instead a VECTOR! Also previously, we first started with the algebraic definitions. For example, the 3D scalar product was defined as u⋅v = u1 v1 +u2 v2 +u3 v3 and the 2D vector product as u×v = u1 v2 −u2 v1 . We then showed that these algebraic definitions were equivalent to some geometric interpretations. For the vector product in 3D, I will go the other way round. That is, I will start with the (very long) geometric definition, then show that it is equivalent to some algebraic interpretation.


Definition 83. Given two distinct 3D vectors u = (ux , uy , uz ) and v = (vx , vy , vz ), their vector product (or cross product), denoted u × v, is the (unique) vector that satisfies 3 properties: 1. u × v is orthogonal (perpendicular) to both u and v.

Let’s see what this first property means. Recall that it doesn’t matter where we put the heads and tails of vectors. So let’s put u and v on the same plane, with their tails at the same point.

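[Figure: the vectors u and v lying in a plane with their tails at the same point and angle θ between them; one perpendicular vector points up out of the plane (green) and the opposite one points down (purple).]

We see that there are exactly two directions that are orthogonal to both u and v: up (green) and down (purple). There is thus an ambiguity. In which of these two directions does u × v point?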

To resolve this ambiguity, we also require that u × v satisfy a second property:

2. u × v satisfies the right-hand rule: Take your right hand and stick out your thumb, forefinger, and middle finger so that they are perpendicular. If your forefinger represents the vector u and your middle finger the vector v, then your thumb represents the vector u × v. Hence, in the figure, u × v points up (green). (Try it yourself!)

Note that the right-hand rule is a mere convention, but one that everyone has agreed upon. There is no especially compelling reason for using it, other than the fact that left-handed people are an oppressed minority! (If instead we used the left-hand rule, then u × v would point down (purple) (try it!), but for better or worse, we don’t use the left-hand rule.)

The third and last property specifies the length (or magnitude) of u × v.

3. ∣u × v∣ = ∣u∣ ∣v∣ sin θ, where θ ∈ [0, π] is the angle between them.

Note that θ ∈ [0, π] Ô⇒ sin θ ≥ 0, so that ∣u∣ ∣v∣ sin θ is never negative. (Otherwise, we’d have the distressing possibility that the length of u × v is sometimes negative!)

One implication of this third and last property is that if both u and v point in the same direction (so that θ = 0) or in exactly opposite directions (so that θ = π), then sin θ = 0 and u × v is the zero vector (i.e. u × v = 0).

Fact 32. (a) i × j = k; (b) j × k = i; (c) k × i = j; (d) j × i = −k; (e) k × j = −i; and (f) i × k = −j.

Proof. In each case, use the right-hand rule to show that properties #1 and #2 of Definition 83 are satisfied.

In each case, the length of the cross product is ∣u∣ ∣v∣ sin θ = 1 × 1 × sin(π/2) = 1. So indeed, property #3 is also satisfied.

Ordinary multiplication is distributive:

Example 275. 3 × (5 + 11) = 3 × 5 + 3 × 11 and 18 × (7 − 31) = 18 × 7 − 18 × 31.

It turns out that the vector product is likewise distributive:

Fact 33. Let a, b, and c be vectors. Then a × (b + c) = a × b + a × c. Moreover, (a + b) × c = a × c + b × c. Proof. Optional, see p. 860 in the Appendices.

The next proposition gives the promised algebraic interpretation of the vector product.

Proposition 7. Given two 3D vectors u = (ux, uy, uz) and v = (vx, vy, vz), their vector product is given by:

$$u \times v = \begin{pmatrix} u_y v_z - u_z v_y \\ u_z v_x - u_x v_z \\ u_x v_y - u_y v_x \end{pmatrix}.$$

Proof. Optional, see p. 861 in Appendices. (The proof is actually quite simple. It just involves some tedious algebra.)

Example 276. If u = (1, 2, 3) and v = (4, 5, 6), then

u × v = (2 × 6 − 3 × 5, 3 × 4 − 1 × 6, 1 × 5 − 2 × 4) = (−3, 6, −3) .

Let’s verify that u × v is orthogonal to u, by computing (u × v) ⋅ u = (−3, 6, −3) ⋅ (1, 2, 3) = −3 + 12 − 9 = 0 ✓. Similarly, let’s verify that u × v is orthogonal to v, by computing (u × v) ⋅ v = (−3, 6, −3) ⋅ (4, 5, 6) = −12 + 30 − 18 = 0 ✓. Example 277. If u = (−1, 3, −5) and v = (2, −4, 6), then

u × v = (3 × 6 − (−5) × (−4), (−5) × 2 − (−1) × 6, (−1) × (−4) − 3 × 2) = (−2, −4, −2) .

Let’s verify that u×v is orthogonal to u, by computing (u × v)⋅u = (−2, −4, −2)⋅(−1, 3, −5) = 2 − 12 + 10 = 0 ✓. Similarly, let’s verify that u × v is orthogonal to v, by computing (u × v) ⋅ v = (−2, −4, −2) ⋅ (2, −4, 6) = −4 + 16 − 12 = 0 ✓. As in the 2D case of the vector product, here again in the 3D case, the vector product is anticommutative, i.e. u × v = −v × u (see Exercise 121). Exercise 119. For each of the following pairs of vectors, compute the vector product and verify that it is orthogonal to each of the two vectors. (a) u = (0, 1, 2) and v = (3, 4, 5). (b) u = (−1, −2, −3) and v = (1, 0, 5). (Answer on p. 1010.) Exercise 120. Verify that in general, u × v is orthogonal to u and v by showing that (u × v) ⋅ u = 0 and that (u × v) ⋅ v = 0. (Answer on p. 1010.)

Exercise 121. (a) Given u = (1, 2, 3) and v = (4, 5, 6), show that v × u = −u × v. (b) Prove that in general, the 3D vector product is anticommutative, i.e. u × v = −v × u. (Answer on p. 1010.)

Ordinary multiplication is associative. This simply means that (a × b) × c = a × (b × c).

Example 278. (2 × 3) × 7 = 2 × (3 × 7) and (8 × 13) × 2 = 8 × (13 × 2).

In contrast, the vector product is not associative.

Example 279. If u = (1, 2, 3), v = (4, 5, 6), and w = (1, 0, 1), then (u × v) × w = (−3, 6, −3) × (1, 0, 1) = (6, 0, −6), but u × (v × w) = (1, 2, 3) × (5, 2, −5) = (−16, 20, −8).
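The component formula of Proposition 7 is easy to get wrong by hand, so a numerical check can help. The Python sketch below is only an illustration (the names `cross` and `dot` are made up); it reproduces Example 276, the orthogonality checks, one line of Fact 32, and the non-associativity of Example 279.

```python
def dot(u, v):
    """Scalar product of two vectors of the same dimension."""
    return sum(ui * vi for ui, vi in zip(u, v))


def cross(u, v):
    """3D vector product, component by component as in Proposition 7."""
    ux, uy, uz = u
    vx, vy, vz = v
    return (uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)


i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
u, v, w = (1, 2, 3), (4, 5, 6), (1, 0, 1)

print(cross(i, j) == k)                              # Fact 32(a)
print(cross(u, v))                                   # Example 276
print(dot(cross(u, v), u), dot(cross(u, v), v))      # orthogonality: both 0
print(cross(cross(u, v), w), cross(u, cross(v, w)))  # Example 279: the two differ
```

Working with integer tuples keeps every result exact, so the equality checks are safe here.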


31 Lines

31.1 Lines on a 2D Plane: Cartesian to Vector Equations

In general, a line on a 2D plane can be described by the cartesian equation ax + by + c = 0.

This says that the line consists of exactly those points (x, y) that satisfy the equation ax + by + c = 0.

(You may be more familiar with describing lines in the form y = mx+d. This simply involves a rearrangement of the above equation. But the above equation is preferred because it is more general — it allows for the possibility that the coefficient on y is 0.) Example 280. Consider the line described by the cartesian equation 3x − y + 2 = 0. Rearranging, we get a more familiar-looking equation: y = 3x + 2.

For convenience (but at the cost of some sloppiness), we may even simply identify the line with the cartesian equation.

Example 281. Consider the line 3x − y + 2 = 0.

Describing lines using cartesian equations is secondary school stuff. We’ll now learn a second method of describing lines: through vector equations. In general, any line can be described in the form

r = p + λv, λ ∈ R,

where r is a generic point on the line, p is some known point on the line, v is a direction vector of the line, and λ is a parameter that can take any real value. Here are some examples to make sense of this.


Example 282. Consider the line (on a 2D plane) described by the cartesian equation 3x − y + 2 = 0. It runs through the point (0, 2). A vector that points in the same direction as this line is (1, 3). Hence, we can also describe it using the vector equation r = (0, 2) + λ(1, 3), λ ∈ R.

This says that the line consists of every point r that can be written as (0, 2) + λ(1, 3) for some real number λ. We call λ a parameter. As λ varies, we get different points of the line. So for example, corresponding to λ = 0, 1, and −1, the line contains the points (0, 2) + 0(1, 3) = (0, 2), (0, 2) + 1(1, 3) = (1, 5), and (0, 2)−1(1, 3) = (−1, −1). Of course, it also contains infinitely many other points, one for each value of λ ∈ R.

We call (1, 3) a direction vector of the line. Note that this direction vector is not unique: any non-zero scalar multiple thereof, i.e. c(1, 3) with c ∈ R and c ≠ 0, is also a direction vector of the line!

Again, we can either say “the line is described by the vector equation r = (0, 2) + λ(1, 3)”. OR, for convenience (but at the cost of some sloppiness), we can also say “the line is the vector equation r = (0, 2) + λ(1, 3)”.

[Figure: the line with cartesian equation 3x − y + 2 = 0 and vector equation r = (0, 2) + λ(1, 3), plotted on the xy-plane, with the point (0, 2) and the vector (1, 3) marked.]
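The move from a cartesian equation to a vector equation can be sketched in Python as follows. This is an illustration only, with made-up function names. It relies on one particular choice that the text does not prescribe: for the line ax + by + c = 0, it takes the direction vector (−b, a), which is perpendicular to (a, b), and it picks the known point by setting x = 0 (or y = 0 when b = 0).

```python
def point_on_line(p, v, lam):
    """Return the point p + lam * v on the line r = p + lam * v."""
    return tuple(pi + lam * vi for pi, vi in zip(p, v))


def cartesian_to_vector(a, b, c):
    """Return (p, v) so that ax + by + c = 0 is the line r = p + lam * v.

    Assumes a and b are not both zero. The direction (-b, a) is one
    convenient choice; any non-zero multiple of it works equally well.
    """
    if b != 0:
        p = (0, -c / b)  # set x = 0 and solve for y
    else:
        p = (-c / a, 0)  # vertical line: set y = 0 and solve for x
    return p, (-b, a)


# Example 282: the line 3x - y + 2 = 0.
p, v = cartesian_to_vector(3, -1, 2)
print(p, v)
for lam in (0, 1, -1):
    x, y = point_on_line(p, v, lam)
    print(lam, (x, y), 3 * x - y + 2 == 0)  # each point satisfies the cartesian equation
```

For Example 282 this particular choice happens to reproduce the point (0, 2) and the direction vector (1, 3) used in the text; for other lines it may return a different, but equally valid, point and direction.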


Example 283. Consider the line (on a 2D plane) described by the cartesian equation x + y − 1 = 0. It runs through the point (0, 1). A vector that points in the same direction as this line is (1, −1). Hence, we can also describe it using the vector equation r = (0, 1) + λ(1, −1), λ ∈ R.

This says that the line consists of every point r that can be written as (0, 1) + λ(1, −1) for some real number λ. Corresponding to λ = 0, 1, and −1, the line contains the points (0, 1) + 0(1, −1) = (0, 1), (0, 1) + 1(1, −1) = (1, 0), and (0, 1)−1(1, −1) = (−1, 2).

[Figure: the line with cartesian equation x + y − 1 = 0 and vector equation r = (0, 1) + λ(1, −1), plotted on the xy-plane, with the point (0, 1) and the vector (1, −1) marked.]

Example 284. Consider the line (on a 2D plane) described by the cartesian equation y − 3 = 0. It runs through the point (0, 3). A vector that points in the same direction as this line is (1, 0). Hence, we can also describe it using the vector equation r = (0, 3) + λ(1, 0), λ ∈ R.

This says that the line consists of every point r that can be written as (0, 3) + λ(1, 0) for some real number λ. Corresponding to λ = 0, 1, and −1, the line contains the points (0, 3) + 0(1, 0) = (0, 3), (0, 3) + 1(1, 0) = (1, 3), and (0, 3) − 1(1, 0) = (−1, 3).

[Figure: the line with cartesian equation y − 3 = 0 and vector equation r = (0, 3) + λ(1, 0), plotted on the xy-plane, with the point (0, 3) and the vector (1, 0) marked.]

Exercise 122. Rewrite each of the following lines into vector equation form. (a) −5x + y + 1 = 0. (b) x − 2y − 1 = 0. (c) y − 4 = 0. (d) x − 4 = 0. (Answer on p. 1011.)

We just learnt how to describe a line using the vector equation

(Point r) = (Point p) + λv, λ ∈ R. (1)

Notice that the LHS of this equation is the generic Point r. And the RHS of the equation is the Point p plus the Vector λv, which equals a Point (see p. 267). So LHS and RHS do indeed match up.

There is another way to describe a line using a vector equation. We can instead write:

(Vector r) = (Vector p) + λv, λ ∈ R, (2)

where now r is the position vector of a generic point r on the line and p is the position vector of some known point p on the line. So now LHS is a Vector and so too is RHS.

Equation (1) said that the line consists of those points r that could be written as p + λv. In contrast, equation (2) says that the line consists of those points whose position vector r can be written as p + λv. But both equations can equally well describe the very same line. The difference is a fine and pedantic one and really doesn’t matter much.

What matters is that you take care not to write either

(Point r) = (Vector p) + λv, λ ∈ R; WRONG! (3)

or

(Vector r) = (Point p) + λv, λ ∈ R. WRONG! (4)

The LHS of (3) is a Point while the RHS of (3) is a Vector. Therefore (3) cannot be true.

The LHS of (4) is a Vector while the RHS of (4) is a Point. Therefore (4) cannot be true.

As usual, this is all very pedantic, but can serve as a useful test of your understanding.

31.2 Lines on a 2D Plane: Vector to Cartesian Equations

In the previous section, given the cartesian equation of a line, we worked out its vector equation. Now given its vector equation, we’ll work out its cartesian equation.

Suppose a line (on a 2D plane) can be described by the vector equation

r = p + λv = (p1, p2) + λ(v1, v2),

where λ ∈ R and v is a non-zero vector.³⁹ And so any point (x, y) on this line must satisfy

x = p1 + λv1 and y = p2 + λv2.

The above are the cartesian equations for a line (on a 2D plane)! But wait a minute ... isn’t there supposed to be just one equation? Well, if we’d like, we can quite easily combine them into a single equation by eliminating the parameter λ. In general:

Fact 34. The line with vector equation r = (p1, p2) + λ(v1, v2) (for λ ∈ R) is the line with cartesian equations as given by the 3 cases below.

(1) (x − p1)/v1 = (y − p2)/v2, if v1, v2 ≠ 0;
(2) x = p1, y is free, if v1 = 0, v2 ≠ 0;
(3) x is free, y = p2, if v1 ≠ 0, v2 = 0.

Note that Case (1) is the most common situation.

Proof. Optional, see p. 862 in the Appendices.

Some examples:

39

Otherwise we’d simply be describing the single point p!


Example 285. The line described by the vector equation r = (1, 2) + λ(1, 1), where λ ∈ R, has cartesian equations

x = 1 + λ and y = 2 + λ.

As λ varies between −∞ and ∞, this pair of equations gives us the points that are on the line. For example, when λ = 1, 17, 33, we have the points (2, 3), (18, 19), and (34, 35). We can eliminate λ and reduce the above pair of equations into the single cartesian equation y = x + 1 or

$$\frac{y-1}{1} = \frac{x}{1}.$$

Example 286. The line described by the vector equation r = (0, 0) + λ(4, 5), where λ ∈ R, has cartesian equations

x = 4λ and y = 5λ.

As λ varies between −∞ and ∞, this pair of equations gives us the points that are on the line. For example, when λ = 1, 17, 33, we have the points (4, 5), (68, 85), and (132, 165).

Eliminating λ, we can reduce the above pair of equations to y = 1.25x or

$$\frac{y}{1.25} = \frac{x}{1}.$$

Example 287. The line described by the vector equation r = (3, 1) + λ(0, 2), where λ ∈ R, has cartesian equations

x = 3 and y = 1 + 2λ.

As λ varies between −∞ and ∞, this pair of equations gives us the points that are on the line. So in fact, the above equations say that x must always be 3 and y is free to vary along with λ. For example, when λ = 1, 17, 33, we have the points (3, 3), (3, 35), and (3, 67). Hence, the above pair of equations can be reduced to x = 3.
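Eliminating λ can also be done once and for all. Cross-multiplying (x − p1)/v1 = (y − p2)/v2 gives v2 x − v1 y + (v1 p2 − v2 p1) = 0, and this single form also covers the cases where v1 or v2 is zero. The Python sketch below is an illustration only; the function name `vector_to_cartesian` and the returned coefficient triple (a, b, c) for ax + by + c = 0 are assumptions of this sketch, not notation from the text.

```python
def vector_to_cartesian(p, v):
    """Return (a, b, c) such that the line r = p + lam * v is ax + by + c = 0.

    Assumes v is a non-zero 2D vector.
    """
    p1, p2 = p
    v1, v2 = v
    return (v2, -v1, v1 * p2 - v2 * p1)


def point_on_line(p, v, lam):
    """Return the point p + lam * v."""
    return tuple(pi + lam * vi for pi, vi in zip(p, v))


for p, v in [((1, 2), (1, 1)), ((0, 0), (4, 5)), ((3, 1), (0, 2))]:
    a, b, c = vector_to_cartesian(p, v)
    points = [point_on_line(p, v, lam) for lam in (1, 17, 33)]
    # Every listed point should satisfy ax + by + c = 0.
    print((a, b, c), points, all(a * x + b * y + c == 0 for x, y in points))
```

For Example 285 the coefficients describe x − y + 1 = 0, i.e. y = x + 1; for Example 287 they describe 2x − 6 = 0, i.e. x = 3.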

Exercise 123. Rewrite each of the following lines into cartesian equation form. (Answer on p. 1011.) (a) r = (−1, 3) + λ(1, −2), where λ ∈ R.

(b) r = (5, 6) + λ(7, 8), where λ ∈ R.

(c) r = (0, −3) + λ(3, 0), where λ ∈ R.


31.3 Lines in 3D Space: Vector Equations

Lines in 2D space are described by the cartesian equation ax + by + c = 0. A reasonable guess might be that lines in 3D space are analogously described by the cartesian equation ax + by + cz + d = 0. Turns out this is wrong! The equation ax + by + cz + d = 0 actually describes a plane, as we’ll see later (Chapter 32). In the 3D case, it’s easier to start by looking at the vector equation of a line. It turns out to be exactly analogous to the 2D case. It can be written as r = a + λv, where λ ∈ R and v is a non-zero 3D vector.

This vector equation says that the line contains every point r that can be expressed as (a1, a2, a3) + λ(v1, v2, v3), where λ ∈ R is a parameter.

Example 288. Consider the line described by the vector equation r = (1, 2, 3) + λ(0, 1, 1), where λ ∈ R. Corresponding to λ = 0, 1, and −1, the line contains the points (1, 2, 3), (1, 3, 4), and (1, 1, 2).

[Figure: the line r = (1, 2, 3) + λ(0, 1, 1) in 3D space, with the points (1, 2, 3), (1, 3, 4), and (1, 1, 2) marked.]

Example 289. Consider the line described by the vector equation r = (0, 0, 0) + λ(1, 0, 0), where λ ∈ R. Corresponding to λ = 0, 1, and −1, the line contains the points (0, 0, 0), (1, 0, 0), and (−1, 0, 0).

[Figure: the line r = (0, 0, 0) + λ(1, 0, 0) in 3D space, with the points (−1, 0, 0), (0, 0, 0), and (1, 0, 0) marked.]

31.4 Lines in 3D Space: Vector to and from Cartesian Equations

We now try to work out the cartesian equation of a line in 3D space. Suppose a line can be described by the vector equation

r = p + λv = (p1, p2, p3) + λ(v1, v2, v3),

where λ ∈ R and v is a non-zero vector.⁴⁰ And so any point (x, y, z) on this line must satisfy

x = p1 + λv1, y = p2 + λv2, and z = p3 + λv3.

The above are the cartesian equations for a line (in 3D space)! These are exactly analogous to the cartesian equations (Section 31.2) in the 2D case. Unlike in the 2D case, it is generally impossible to reduce these equations into a single cartesian equation. However, we can reduce them into two equations.

Fact 35. The line with vector equation r = (p1, p2, p3) + λ(v1, v2, v3) where λ ∈ R is the line with cartesian equations as given by the 7 cases below.

(1) (x − p1)/v1 = (y − p2)/v2 = (z − p3)/v3, if v1, v2, v3 ≠ 0 (most common case);
(2) x = p1, (y − p2)/v2 = (z − p3)/v3, if v1 = 0, v2, v3 ≠ 0;
(3) y = p2, (x − p1)/v1 = (z − p3)/v3, if v2 = 0, v1, v3 ≠ 0;
(4) z = p3, (x − p1)/v1 = (y − p2)/v2, if v3 = 0, v1, v2 ≠ 0;
(5) x = p1, y = p2, z is free, if v1, v2 = 0, v3 ≠ 0;
(6) x = p1, z = p3, y is free, if v1, v3 = 0, v2 ≠ 0;
(7) y = p2, z = p3, x is free, if v2, v3 = 0, v1 ≠ 0.

Proof. Optional, see p. 863 in the Appendices.

40

If v is a zero vector, then we are simply describing the single point p!
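The seven cases of Fact 35 can be folded into one membership test: a coordinate with a zero direction component is pinned to the matching coordinate of p, and the remaining ratios must all agree. The sketch below is only an illustration. The name satisfies_cartesian is made up, and it assumes integer (or Fraction) coordinates so that the arithmetic is exact.

```python
from fractions import Fraction

def satisfies_cartesian(p, v, point):
    """Check whether `point` satisfies the cartesian equations of Fact 35
    for the line r = p + lam*v (integer or Fraction coordinates assumed)."""
    if all(v_i == 0 for v_i in v):
        raise ValueError("v must be a non-zero vector")
    ratios = set()
    for p_i, v_i, x_i in zip(p, v, point):
        if v_i == 0:
            # Zero direction component: this coordinate is pinned to p_i.
            if x_i != p_i:
                return False
        else:
            # Non-zero component: (x_i - p_i)/v_i must be the same for all of them.
            ratios.add(Fraction(x_i - p_i, v_i))
    return len(ratios) == 1

# Case 1: the line r = (1, 2, 3) + lam*(4, 5, 6) contains (5, 7, 9).
print(satisfies_cartesian((1, 2, 3), (4, 5, 6), (5, 7, 9)))
# Case 5: on the line r = (1, 2, 3) + lam*(0, 0, 6), z is free.
print(satisfies_cartesian((1, 2, 3), (0, 0, 6), (1, 2, -50)))
```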


The first two examples are where v1, v2, and v3 are all non-zero (Case 1 of Fact 35).

Example 290. Consider the line described by the vector equation r = (1, 2, 3) + λ(4, 5, 6), where λ ∈ R. It can be described by the cartesian equations x = 1 + 4λ, y = 2 + 5λ, and z = 3 + 6λ.

As λ varies between −∞ and ∞, these 3 equations give us the points that are on the line. For example, when λ = 1, 3, 17, we have the points (5, 7, 9), (13, 17, 21), and (69, 87, 105).

By rearranging each equation so that λ is on one side, we can reduce these three equations to just two: (x − 1)/4 = (y − 2)/5 = (z − 3)/6.

That is, this is the line that contains the points (x, y, z) which satisfy the above cartesian equations.

Example 291. Consider the line described by the vector equation r = (0, 0, 0) + λ(2, 3, 5), where λ ∈ R. It can be described by the cartesian equations x = 2λ, y = 3λ, and z = 5λ.

As λ varies between −∞ and ∞, these 3 equations give us the points that are on the line. For example, when λ = 1, 3, 17, we have the points (2, 3, 5), (6, 9, 15), and (34, 51, 85).

By rearranging each equation so that λ is on one side, we can reduce these three equations to just two: x/2 = y/3 = z/5.

That is, this is the line that contains the points (x, y, z) which satisfy the above cartesian equations.

We now look at examples where exactly one of v1, v2, or v3 is zero (Cases 2, 3, and 4 of Fact 35). In the case where v1 = 0 (but v2 ≠ 0 and v3 ≠ 0), the line lies on the plane x = p1, which is parallel to the yz-plane.

Example 292. Consider the line described by the vector equation r = (1, 2, 3) + λ(0, 5, 6), where λ ∈ R. It can be described by the cartesian equations x = 1, y = 2 + 5λ, and z = 3 + 6λ.

As λ varies between −∞ and ∞, these 3 equations give us the points that are on the line. For example, when λ = 1, 3, 17, we have the points (1, 7, 9), (1, 17, 21), and (1, 87, 105).

We see that x must always be equal to 1.

By rearranging the second and third equations so that λ is on one side, we can reduce these three equations to just two: x = 1 and (y − 2)/5 = (z − 3)/6.

That is, this is the line that contains the points (x, y, z) which satisfy the above cartesian equations.

Similarly, in the case where v2 = 0 (but v1 ≠ 0 and v3 ≠ 0), the line lies on the plane y = p2, which is parallel to the xz-plane.

Example 293. Consider the line described by the vector equation r = (1, 2, 3) + λ(4, 0, 6), where λ ∈ R. It can be described by the cartesian equations x = 1 + 4λ, y = 2, and z = 3 + 6λ.

As λ varies between −∞ and ∞, these 3 equations give us the points that are on the line. For example, when λ = 1, 3, 17, we have the points (5, 2, 9), (13, 2, 21), and (69, 2, 105). We see that y must always be equal to 2.

By rearranging the first and third equations so that λ is on one side, we can reduce these three equations to just two: y = 2 and (x − 1)/4 = (z − 3)/6.

That is, this is the line that contains the points (x, y, z) which satisfy the above cartesian equations.

Finally, in the case where v3 = 0 (but v1 ≠ 0 and v2 ≠ 0), the line lies on the plane z = p3, which is parallel to the xy-plane.

Example 294. Consider the line described by the vector equation r = (1, 2, 3) + λ(4, 5, 0), where λ ∈ R. It can be described by the cartesian equations x = 1 + 4λ, y = 2 + 5λ, and z = 3.

As λ varies between −∞ and ∞, these 3 equations give us the points that are on the line. For example, when λ = 1, 3, 17, we have the points (5, 7, 3), (13, 17, 3), and (69, 87, 3).

We see that z must always be equal to 3.

By rearranging the first and second equations so that λ is on one side, we can reduce these three equations to just two: z = 3 and (x − 1)/4 = (y − 2)/5.

That is, this is the line that contains the points (x, y, z) which satisfy the above cartesian equations.

We now look at examples where exactly two of v1, v2, or v3 are zero (Cases 5, 6, and 7 of Fact 35). In the case where v1 = 0 and v2 = 0, but v3 ≠ 0, then this is a line that runs through the points (p1, p2, λ) for λ ∈ R.

Example 295. Consider the line described by the vector equation r = (1, 2, 3) + λ(0, 0, 6), where λ ∈ R. It can be described by the cartesian equations x = 1, y = 2, and z = 3 + 6λ.

As λ varies between −∞ and ∞, these 3 equations give us the points that are on the line. For example, when λ = 1, 3, 17, we have the points (1, 2, 9), (1, 2, 21), and (1, 2, 105). We see that x and y must always be equal to 1 and 2. Hence, the above equations simply reduce to: x = 1 and y = 2.

That is, this is the line that contains the points (x, y, z) which satisfy the above cartesian equations. These are the points (1, 2, λ), where λ can be any real.

Similarly, in the case where v1 = 0 and v3 = 0, but v2 ≠ 0, then this is a line that runs through the points (p1, λ, p3) for λ ∈ R.

Example 296. Consider the line described by the vector equation r = (1, 2, 3) + λ(0, 5, 0), where λ ∈ R. It can be described by the cartesian equations x = 1, y = 2 + 5λ, and z = 3.

As λ varies between −∞ and ∞, these 3 equations give us the points that are on the line. For example, when λ = 1, 3, 17, we have the points (1, 7, 3), (1, 17, 3), and (1, 87, 3). We see that x and z must always be equal to 1 and 3. Hence, the above equations simply reduce to: x = 1 and z = 3.

That is, this is the line that contains the points (x, y, z) which satisfy the above cartesian equations. These are the points (1, λ, 3), where λ can be any real.

In the case where v2 = 0 and v3 = 0, but v1 ≠ 0, then this is a line that runs through the points (λ, p2, p3) for λ ∈ R.

Example 297. Consider the line described by the vector equation r = (1, 2, 3) + λ(4, 0, 0), where λ ∈ R. It can be described by the cartesian equations x = 1 + 4λ, y = 2, and z = 3.

As λ varies between −∞ and ∞, these 3 equations give us the points that are on the line. For example, when λ = 1, 3, 17, we have the points (5, 2, 3), (13, 2, 3), and (69, 2, 3).

We see that y and z must always be equal to 2 and 3. Hence, the above equations simply reduce to: y = 2 and z = 3.

That is, this is the line that contains the points (x, y, z) which satisfy the above cartesian equations. These are the points (λ, 2, 3), where λ can be any real.

Exercise 124. Rewrite each of the following vector equation descriptions of lines into cartesian equations describing the same line. (Answer on p. 1012.)

(a) r = (−1, 1, 1) + λ(3, −2, 1), where λ ∈ R.
(b) r = (5, 6, 1) + λ(7, 8, 1), where λ ∈ R.
(c) r = (0, −3, 1) + λ(3, 0, 1), where λ ∈ R.
(d) r = (9, 9, 9) + λ(1, 0, 0), where λ ∈ R.

SYLLABUS ALERT

The 9740 (old) List of Formulae contains the following statement, but not the 9758 (revised) List of Formulae!

“If A is the point with position vector a = a1 i + a2 j + a3 k and the direction vector b is given by b = b1 i + b2 j + b3 k, then the straight line through A with direction vector b has cartesian equation (x − a1)/b1 = (y − a2)/b2 = (z − a3)/b3 (= λ).”

By the way, the above statement printed in the 9740 (old) List of Formulae is false (*gasp*), because it fails to specify that b1, b2, b3 must be non-zero. (The correct statement was just given as Fact 35.) Consider for example the point A = (0, 0, 0) and the direction vector b given by b = j + k. Then contrary to the above statement, the straight line through A with direction vector b does not have cartesian equation x/0 = y/1 = z/1, because x/0 is undefined. This is the common mistake to which I devoted an entire chapter (Chapter 2) earlier in this book. This seems like a very pedantic point to make, but dividing by zero has been the cause of the downfall of many a student (and in this case some folks at MOE or wherever the heck these things are written).

In the examples and exercise we just went through, we started with a vector equation of a line and then wrote down the line’s cartesian equations. Now we’ll go the other way round, starting with the line’s cartesian equations, then writing down a vector equation.

Example 298. Consider a line described by the cartesian equations (3x − 4)/6 = (2y − 18)/5 = (z − 1)/3.

In order to directly apply Fact 35, you must make sure that the coefficients on x, y, and z are all 1! So first rewrite the above into (x − 4/3)/2 = (y − 9)/2.5 = (z − 1)/3.

And now by Fact 35, we can immediately describe this line by the vector equation r = (4/3, 9, 1) + λ(2, 2.5, 3), for λ ∈ R.

Example 299. Consider a line described by the cartesian equations 5x/2 = (y − 13)/6 = (3z − 14)/8.

Rewrite these into x/(2/5) = (y − 13)/6 = (z − 14/3)/(8/3).

And so by Fact 35, we can immediately describe this line by the vector equation r = (0, 13, 14/3) + λ(2/5, 6, 8/3), for λ ∈ R.
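The rescaling step in Examples 298 and 299 is mechanical, so it can be sketched in a few lines of Python. This is only an illustration for Case 1 of Fact 35. The name cartesian_to_vector and the (a, b, c) encoding of each term (a × coordinate − b)/c are assumptions made here, not notation from the text.

```python
from fractions import Fraction

def cartesian_to_vector(terms):
    """Each term (a, b, c) stands for (a*coordinate - b)/c, with a and c non-zero.
    Dividing top and bottom by a gives (coordinate - b/a)/(c/a),
    so the line is r = p + lam*v with p_i = b/a and v_i = c/a."""
    p = tuple(Fraction(b, a) for a, b, c in terms)
    v = tuple(Fraction(c, a) for a, b, c in terms)
    return p, v

# Example 298: (3x - 4)/6 = (2y - 18)/5 = (z - 1)/3.
p, v = cartesian_to_vector([(3, 4, 6), (2, 18, 5), (1, 1, 3)])
print(p, v)
```

Exact fractions are used so that 2.5 appears as 5/2 and 4/3 is not rounded.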

Example 300. Consider a line described by the cartesian equations 2x = 17 and 3z = 4. Rewrite them into x = 8.5 and z = 4/3. And so by Fact 35, we can immediately describe this line by the vector equation r = (8.5, 0, 4/3) + λ(0, 1, 0), for λ ∈ R.

Exercise 125. Rewrite each of the following cartesian equation descriptions of lines into a vector equation describing the same line. (Answer on p. 1013.)

(a) (7x − 2)/5 = (0.3y − 5)/7 = 8z/7.
(b) 2x = 3y = 5z.
(c) 17x − 4 = (3y − 1)/2 = 3z.
(d) (x − 3)/2 = (5z − 2)/7, 3y = 11.

31.5 Collinearity

Definition 84. A set of points are said to be collinear if there is some line that contains all of these points.

Any two points are always collinear: simply take the line that passes through both of them. In contrast, three points may not be collinear. To check whether three points are collinear,

1. First take the line that passes through two of the points.
2. Then check whether the third point is on this line.

[Figure: two panels, each showing three points a, b, and c. In one panel, a, b, and c are not collinear. In the other, a, b, and c are collinear.]

Example 301. Are the points a = (1, 2, 3), b = (4, 5, 6), and c = (7, 8, 9) collinear?

First take the line through a and b. The vector from a to b is (3, 3, 3) and the line passes through a. Hence, the line can be written as r = (1, 2, 3) + λ(3, 3, 3) (λ ∈ R).

Then check whether c is on the line: Is there λ such that c = (7, 8, 9) = (1, 2, 3) + λ(3, 3, 3)? Rearranging, we have (6, 6, 6) = λ(3, 3, 3), which we can write out as: 6 = 3λ, 6 = 3λ, 6 = 3λ.

Clearly, all three of the above equations are true if λ = 2. And so c is also on the line. Hence, the three points are collinear.
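The two-step check (take the line through two points, then test the third) can be sketched in Python as follows. The function name are_collinear is made up for this illustration, and integer (or Fraction) coordinates are assumed so that the comparison is exact.

```python
from fractions import Fraction

def are_collinear(a, b, c):
    """Check whether c lies on the line through the distinct points a and b."""
    v = tuple(b_i - a_i for a_i, b_i in zip(a, b))  # direction vector from a to b
    w = tuple(c_i - a_i for a_i, c_i in zip(a, c))  # we need w = lam * v
    if all(v_i == 0 for v_i in v):
        raise ValueError("a and b must be distinct points")
    # Solve for lam using the first non-zero component of v ...
    i = next(k for k, v_k in enumerate(v) if v_k != 0)
    lam = Fraction(w[i], v[i])
    # ... then check that the same lam works in every component.
    return all(w_k == lam * v_k for w_k, v_k in zip(w, v))

# The points a = (1, 2, 3), b = (4, 5, 6), c = (7, 8, 9): lam = 2 works throughout.
print(are_collinear((1, 2, 3), (4, 5, 6), (7, 8, 9)))  # True
```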

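[Figure: two panels. Left panel, “a, b, and c are collinear”: the points a = (1, 2, 3), b = (4, 5, 6), and c = (7, 8, 9), with the vector (3, 3, 3) from a to b. Right panel, “d, e, and f are not collinear”: the points d = (1, 0, 0), e = (0, 1, 0), and f = (0, 0, 1), with the vector (−1, 1, 0) from d to e.]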

Example 302. Are the points d = (1, 0, 0), e = (0, 1, 0), and f = (0, 0, 1) collinear?

First take the line through d and e . The vector from d to e is (−1, 1, 0) and the line passes through d. Hence, the line can be written as r = (1, 0, 0) + λ(−1, 1, 0) (λ ∈ R).

Then check whether f is on the line: Is there λ such that f = (0, 0, 1) = (1, 0, 0) + λ(−1, 1, 0)? Rearranging, we have (−1, 0, 1) = λ(−1, 1, 0), which we can write out as: −1 = −λ, 0 = λ, 1 = 0.

Clearly, there is no λ such that the above three equations can be true. And so the point f is not on the line through d and e. Hence, the three points are not collinear.

Exercise 126. Determine whether each of the following sets of three points is collinear. (Answer on p. 1013.)

(a) a = (3, 1, 2), b = (1, 6, 5), and c = (0, −1, 0).
(b) a = (1, 2, 4), b = (0, 0, 1), and c = (3, 6, 10).

32 Planes

Informally, a plane is a flat 2D surface in 3D space. How do we describe it using equations? Here are two very useful clues:

1. u ⊥ v ⇐⇒ u ⋅ v = 0. In words: Two vectors are orthogonal if and only if their scalar product is 0.

2. Since the plane is a flat surface, there must be some vector n that is orthogonal (perpendicular) to this plane. That is, n is orthogonal to every vector on the plane. We call n the plane’s normal vector (hence the use of the letter n).

Is the normal vector unique? No, because any other vector cn (where c is any non-zero scalar) serves equally well as a normal vector. In the figure below, n is a normal vector to the illustrated plane. So too is 0.5n. And so too is −n.

But otherwise, besides cn, there are no other vectors that are also orthogonal to the plane. That is, any vector that cannot be written in the form cn is not orthogonal to the plane.

[Figure: a plane with black vectors lying on it, together with the normal vectors n, 0.5n, and −n.]

Suppose a plane contains some point p = (p1 , p2 , p3 ) and has a normal vector n = (a, b, c). Now consider any point r on the plane. We can construct the vector r − p.

This vector r − p lies on the plane and must therefore be orthogonal to n, the plane’s normal vector. So, for any point r that lies on the plane, we have (r − p) ⋅ n = 0.

[Figure: a plane containing the points p and r, with the vector r − p lying on the plane; a point q not on the plane, with the vector q − p not on the plane; and a normal vector n. Caption: n is normal to r − p, but not to q − p.]

Now consider any point q that is not on the plane. We can construct the vector q − p.

This vector q − p does not lie on the plane and must therefore not be orthogonal to n, the plane’s normal vector. So, for any point q that does not lie on the plane, we have (q − p) ⋅ n ≠ 0.

Altogether then, we conclude: A point r is on the plane if and only if (r − p) ⋅ n = 0.

Fact 36. Suppose a plane contains point p and has normal vector n. Then the plane contains exactly those points r such that (r − p) ⋅ n = 0.

Recall that a line can be described by the vector equation r = p + λv, where r is a generic point on the line and p is a known point on the line. Equivalently, we can read r and p in this equation as the position vectors of those two points.

On the previous page, we proved that if a plane contains point p and has normal vector n, then it may be described by the vector equation (r − p) ⋅ n = 0, where r is a generic point on the plane. As with a line, we can equally read r and p here as position vectors. By the distributivity of the scalar product, we have:

(r − p) ⋅ n = 0 ⇐⇒ r ⋅ n − p ⋅ n = 0 ⇐⇒ r ⋅ n = p ⋅ n.

Now, p is known (it is the position vector of a point p known to be on the plane). So too is n (it is the plane’s normal vector). Thus, p ⋅ n is simply some known number. So we can describe the plane even more simply by the vector equation r ⋅ n = d, where d = p ⋅ n.

Example 303. Consider a plane that contains the point p = (1, 2, 3) and has normal vector n = (1, 1, 0). We compute

d = p ⋅ n = (1, 2, 3) ⋅ (1, 1, 0) = 1 × 1 + 2 × 1 + 3 × 0 = 3.

We thus conclude that the plane may be described by the vector equation r ⋅ (1, 1, 0) = 3.

This says that the plane contains exactly those points r whose position vectors satisfy the above equation. For example, the points r1 = (3, 0, 0), r2 = (0, 3, 5), and r3 = (1, 2, −1) are on the plane, because their position vectors satisfy the above equation, as we can easily verify:

r1 ⋅ n = (3, 0, 0) ⋅ (1, 1, 0) = 3 × 1 + 0 × 1 + 0 × 0 = 3.
r2 ⋅ n = (0, 3, 5) ⋅ (1, 1, 0) = 0 × 1 + 3 × 1 + 5 × 0 = 3.
r3 ⋅ n = (1, 2, −1) ⋅ (1, 1, 0) = 1 × 1 + 2 × 1 + (−1) × 0 = 3.

Lest you be sceptical that a plane could be described so simply, let’s verify that two vectors on the plane are indeed orthogonal to the normal vector n. First consider r2 − r1 = (0, 3, 5) − (3, 0, 0) = (−3, 3, 5), which is a vector on the plane. We can verify that indeed

(r2 − r1) ⋅ n = (−3, 3, 5) ⋅ (1, 1, 0) = −3 × 1 + 3 × 1 + 5 × 0 = 0.

Next consider p − r3 = (1, 2, 3) − (1, 2, −1) = (0, 0, 4), which is also a vector on the plane. We can verify that indeed

(p − r3) ⋅ n = (0, 0, 4) ⋅ (1, 1, 0) = 0 × 1 + 0 × 1 + 4 × 0 = 0.
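The checks in Example 303 are all scalar products, so they are easy to reproduce. The following Python sketch (helper names dot and on_plane are made up for this illustration) computes d = p ⋅ n and tests whether a point satisfies r ⋅ n = d.

```python
def dot(u, v):
    """Scalar (dot) product of two vectors given as tuples."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))

def on_plane(r, n, d):
    """True if the point r satisfies the plane equation r . n = d."""
    return dot(r, n) == d

# Example 303: the plane through p = (1, 2, 3) with normal n = (1, 1, 0).
p, n = (1, 2, 3), (1, 1, 0)
d = dot(p, n)
print(d)  # 3
for r in [(3, 0, 0), (0, 3, 5), (1, 2, -1)]:
    print(r, on_plane(r, n, d))
```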


Example 304. Consider a plane that contains the point p = (0, 0, 1) and has normal vector n = (2, −1, 1). We compute

d = p ⋅ n = (0, 0, 1) ⋅ (2, −1, 1) = 0 × 2 + 0 × (−1) + 1 × 1 = 1.

We thus conclude that the plane may be described by the vector equation r ⋅ (2, −1, 1) = 1.

This says that the plane contains exactly those points r whose position vectors satisfy the above equation. For example, the points r1 = (1, 1, 0), r2 = (0, 1, 2), and r3 = (1, 2, 1) are on the plane, because their position vectors satisfy the above equation, as we can easily verify:

r1 ⋅ n = (1, 1, 0) ⋅ (2, −1, 1) = 1 × 2 + 1 × (−1) + 0 × 1 = 1.
r2 ⋅ n = (0, 1, 2) ⋅ (2, −1, 1) = 0 × 2 + 1 × (−1) + 2 × 1 = 1.
r3 ⋅ n = (1, 2, 1) ⋅ (2, −1, 1) = 1 × 2 + 2 × (−1) + 1 × 1 = 1.

Lest you be sceptical that a plane could be described so simply, let’s verify that two vectors on the plane are indeed orthogonal to the normal vector n. First consider r2 − r1 = (0, 1, 2) − (1, 1, 0) = (−1, 0, 2), which is a vector on the plane. We can verify that indeed

(r2 − r1) ⋅ n = (−1, 0, 2) ⋅ (2, −1, 1) = (−1) × 2 + 0 × (−1) + 2 × 1 = 0.

Next consider p − r3 = (0, 0, 1) − (1, 2, 1) = (−1, −2, 0), which is also a vector on the plane. We can verify that indeed

(p − r3) ⋅ n = (−1, −2, 0) ⋅ (2, −1, 1) = (−1) × 2 + (−2) × (−1) + 0 × 1 = 0.

We just learnt how to write down the vector equation of a plane, given a point on the plane and its normal vector. We now do the same, but given instead three points on a plane.

Example 305. A plane contains the points a = (1, 2, 3), b = (4, 5, 8), and c = (2, 3, 5). Both vectors ab = b − a = (3, 3, 5) and ac = c − a = (1, 1, 2) are on the plane. Hence, a normal vector to the plane is n = ab × ac = (1, −1, 0). Since a ⋅ n = −1, the plane can be described by the vector equation r ⋅ (1, −1, 0) = −1.

Example 306. A plane contains the points a = (1, 0, 0), b = (0, 1, 0), and c = (0, 0, 1). Both vectors ab = b − a = (−1, 1, 0) and ac = c − a = (−1, 0, 1) are on the plane. Hence, a normal vector to the plane is n = ab × ac = (1, 1, 1). Since a ⋅ n = 1, the plane can be described by the vector equation r ⋅ (1, 1, 1) = 1.
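The recipe in Examples 305 and 306 (two vectors on the plane, their vector product as the normal, then d = a ⋅ n) can be sketched as below. The function names cross and plane_through are made up for this illustration.

```python
def cross(u, v):
    """Vector (cross) product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def plane_through(a, b, c):
    """Return (n, d) such that the plane through a, b, c is r . n = d."""
    ab = tuple(b_i - a_i for a_i, b_i in zip(a, b))  # vector from a to b
    ac = tuple(c_i - a_i for a_i, c_i in zip(a, c))  # vector from a to c
    n = cross(ab, ac)
    if n == (0, 0, 0):
        raise ValueError("the three points are collinear, so no unique plane")
    d = sum(a_i * n_i for a_i, n_i in zip(a, n))
    return n, d

# Example 305: a = (1, 2, 3), b = (4, 5, 8), c = (2, 3, 5).
print(plane_through((1, 2, 3), (4, 5, 8), (2, 3, 5)))  # ((1, -1, 0), -1)
```

The same cross helper also covers the case of two points and a vector on the plane: take the vector product of b − a with the given vector instead.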


We now write down the vector equation of a plane, given two points and a vector on the plane.

Example 307. A plane contains the points a = (0, 0, 3) and b = (1, 4, 5), and the vector v = (3, 2, 1). Both vectors ab = b − a = (1, 4, 2) and v = (3, 2, 1) are on the plane. Hence, a normal vector to the plane is n = ab × v = (0, 5, −10). Since a ⋅ n = −30, the plane can be described by the vector equation r ⋅ (0, 5, −10) = −30.

Example 308. A plane contains the points a = (8, −2, 0) and b = (3, 6, 9), and the vector v = (0, 1, 1). Both vectors ab = b − a = (−5, 8, 9) and v = (0, 1, 1) are on the plane. Hence, a normal vector to the plane is n = ab × v = (−1, 5, −5). Since a ⋅ n = −18, the plane can be described by the vector equation r ⋅ (−1, 5, −5) = −18.

32.1 Planes: Vector to Cartesian Equations

Let n = (a, b, c) be the normal vector of a plane. Let p = (p1, p2, p3) be a point on the plane. Then the plane can be described by the vector equation r ⋅ n = p ⋅ n, where r = (x, y, z) is the position vector of a generic point on the plane. Writing out the vectors in the above equation explicitly, we have:

(x, y, z) ⋅ (a, b, c) = (p1, p2, p3) ⋅ (a, b, c),

or ax + by + cz = ap1 + bp2 + cp3.

This last equation is the cartesian equation description of the same plane. Note, once again, that d = ap1 + bp2 + cp3 is simply some known number. So this cartesian equation simply says that the plane contains exactly those points (x, y, z) that satisfy the equation ax + by + cz = d.

Example 309. The plane with vector equation r ⋅ (1, 1, 0) = 3 has cartesian equation x + y = 3.

Example 310. The plane with vector equation r ⋅ (2, −1, 1) = 1 has cartesian equation 2x − y + z = 1.

Example 311. The plane with vector equation r ⋅ (1, −1, 0) = −1 has cartesian equation x − y = −1.

Example 312. The plane with vector equation r ⋅ (1, 1, 1) = 1 has cartesian equation x + y + z = 1.

Example 313. The plane with vector equation r ⋅ (0, 5, −10) = −30 has cartesian equation 5y − 10z = −30.

Example 314. The plane with vector equation r ⋅ (−1, 5, −5) = −18 has cartesian equation −x + 5y − 5z = −18.


It’s thus easy to go back and forth between a plane’s vector and cartesian equations: r ⋅ (a, b, c) = d ⇐⇒ ax + by + cz = d.

Example 315. Given a plane with cartesian equation 2x + 3y + 5z = −7, we immediately know that it has vector equation r ⋅ (2, 3, 5) = −7.

Example 316. Given a plane with cartesian equation 2x + 3z = −5, we immediately know that it has vector equation r ⋅ (2, 0, 3) = −5.

Here’s a nice observation: Every plane that contains the origin (0, 0, 0) can be written in the form ax + by + cz = 0. Conversely, every plane that does not contain the origin can be written in the form ax + by + cz = 1. Formally:

Fact 37. A plane r ⋅ n = d contains the origin ⇐⇒ d = 0.

Proof. Given a plane r ⋅ n = d, the origin is on the plane (and thus satisfies this equation) ⇐⇒ 0 ⋅ n = d ⇐⇒ d = 0.

SYLLABUS ALERT

The following statement is in the old but not the new List of Formulae.

“The plane through A with normal vector n = n1 i + n2 j + n3 k has cartesian equation n1 x + n2 y + n3 z + d = 0, where d = −a ⋅ n.”

Exercise 127. Find the vector and cartesian equations that describe the planes containing each of the following sets of three points: (Answer on p. 1014.)

(a) a = (7, 3, 4), b = (8, 3, 4), and c = (9, 3, 7).
(b) a = (8, 0, 2), b = (4, 4, 3), and c = (2, 7, 2).
(c) a = (8, 5, 9), b = (8, 4, 5), and c = (5, 6, 0).

Exercise 128. Write down the vector equations of the planes whose cartesian equations are as given: (Answer on p. 1015.)

(a) 3x + 2y + 5z = −3.
(b) 2y + 5z = −3.
(c) 5z = −3.

32.2 Planes: Hessian Normal Form

Example 317. The plane with vector equation r ⋅ (1, 0, 1) = 11 or cartesian equation x + z = 11 can be described in an infinite number of ways. For example, the same plane can also be described by any of the following four equations: r ⋅ (2, 0, 2) = 22, r ⋅ (1/11, 0, 1/11) = 1, 2x + 2z = 22, and x/11 + z/11 = 1.

If you talk about the plane r ⋅ (2, 0, 2) = 22 and I talk about the plane x/11 + z/11 = 1, it may take us a moment to realise that we are talking about the exact same plane. To save ourselves such trouble, it may be desirable to describe planes in a standardised form, called the Hessian normal form.

This involves simply picking the unit normal vector n̂ as our normal vector. However, there are two possible unit normal vectors, one pointing “up” and the other pointing “down”. We will choose the unit normal vector that ensures that p ⋅ n̂ ≥ 0, so that the RHS of our vector or cartesian equation in Hessian normal form is always non-negative.

Example 318. Consider the plane r ⋅ (1, 0, 1) = 11 or x + z = 11. The unit vector in the direction of (1, 0, 1) is (√2/2, 0, √2/2). And so the plane can be rewritten in Hessian normal form as r ⋅ (√2/2, 0, √2/2) = 11√2/2 or (√2/2)x + (√2/2)z = 11√2/2.

Notice that in the Hessian normal form, the number d̂ = p ⋅ n̂ is uniquely defined. Indeed, it is the distance of the plane from the origin! (We’ll prove this in section 33.2.)

Example 319. Consider the plane r ⋅ (8, 1, 3) = −3 or 8x + y + 3z = −3. The unit vector in the direction of (8, 1, 3) is (8/√74, 1/√74, 3/√74). Note though that right now, the RHS is negative. So in order to ensure that d̂ ≥ 0 (as required by the Hessian normal form), we need simply reverse the sign of our unit normal vector. That is, we should pick (−8/√74, −1/√74, −3/√74) as our unit normal vector. Altogether then, the plane can be rewritten in Hessian normal form as r ⋅ (−8/√74, −1/√74, −3/√74) = 3/√74 or (−8/√74)x − (1/√74)y − (3/√74)z = 3/√74.

Exercise 129. Rewrite each of the following planes’ vector equation into Hessian normal form. (Answer on p. 1015.)

(a) r ⋅ (3, 6, 2) = 4.
(b) r ⋅ (1, 2, 2) = −1.
(c) r ⋅ (8, 1, 4) = 0.
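The conversion to Hessian normal form is just a rescaling by the length of n, with a sign flip when d is negative. Here is a minimal floating-point sketch in Python. The function name hessian_normal_form is made up for this illustration, and results are rounded decimals rather than exact surds.

```python
import math

def hessian_normal_form(n, d):
    """Rewrite the plane r . n = d as r . n_hat = d_hat,
    where n_hat is a unit normal vector and d_hat >= 0."""
    length = math.sqrt(sum(n_i * n_i for n_i in n))
    if length == 0:
        raise ValueError("n must be a non-zero vector")
    sign = -1.0 if d < 0 else 1.0  # flip the normal if the RHS is negative
    n_hat = tuple(sign * n_i / length for n_i in n)
    d_hat = sign * d / length
    return n_hat, d_hat

# Example 318: r . (1, 0, 1) = 11.
print(hessian_normal_form((1, 0, 1), 11))
# Example 319: r . (8, 1, 3) = -3, where the normal gets reversed.
print(hessian_normal_form((8, 1, 3), -3))
```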

33 Distances

Before we proceed, here are some useful things to remember. A line can be fully determined by

1. Any two distinct points.
2. Any non-zero vector and a point.

Similarly, a plane can be fully determined by

1. Any three distinct points that are not collinear.
2. Any two distinct points and a distinct vector.⁴¹
3. Two distinct vectors and a point.

⁴¹ If the two points are a and b, then the vector must be distinct from c(b − a), for any c ∈ R.

33.1 Distance of a Point from a Line

Definition 85. The foot of the perpendicular from a point a to a line l is the point b on the line l that is closest to the point a. The distance between the point a and the line l is the length of the line segment ab.

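[Figure: a point a, a line l through the point p, and the foot of the perpendicular b on l. The distance between a and b is marked.]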

Note that the line ab must be perpendicular to the line l. Hence the name foot of the perpendicular.

Rather than try to memorise the following proposition, it’s easier to just remember how the proof works:


Proposition 8. Given a point a and a line r = p + λv (for λ ∈ R), write pa = a − p for the vector from p to a and v̂ for the unit vector in the direction of v. Then

(a) The distance between the point and the line is √(|pa|² − (pa ⋅ v̂)²); and

(b) The foot of the perpendicular from the point to the line is the point p + (pa ⋅ v̂) v̂.

Proof. Let b be the foot of the perpendicular from the point to the line.

(a) Pick any known point on the line. Here the obvious choice is p. Consider the right-angled triangle △bpa. It has hypotenuse of length |pa| and base of length |pa ⋅ v̂| (refer to the diagram above). Hence, by the Pythagorean Theorem, the length of line segment ab (or the distance between the point a and the line l) is √(|pa|² − (pa ⋅ v̂)²), as desired.

(b) The point b is a distance |pa ⋅ v̂| away from the point p, heading in the direction of the vector pb (from p to b). Hence b = p + |pa ⋅ v̂| × (unit vector in the direction of pb). Call this equation (∗). There are two possible cases to examine.

Case #1: v̂ is pointing in the same direction as pb.

Then the unit vector in the direction of pb is v̂ and pa ⋅ v̂ > 0, so that |pa ⋅ v̂| = pa ⋅ v̂. Altogether then, (∗) becomes b = p + |pa ⋅ v̂| v̂ = p + (pa ⋅ v̂) v̂, as desired. ✓

Case #2: v̂ and pb are pointing in opposite directions.

Then the unit vector in the direction of pb is −v̂ and pa ⋅ v̂ < 0, so that |pa ⋅ v̂| = −pa ⋅ v̂. Altogether then, (∗) becomes b = p + (−pa ⋅ v̂)(−v̂) = p + (pa ⋅ v̂) v̂, as desired. ✓

On p. 864 in the Appendices (optional), I give another proof of the above Proposition using calculus. The idea of this second proof will be illustrated in the last two examples of this section.
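For readers who like to check such formulas numerically, here is a floating-point sketch of both parts of Proposition 8 in Python. The function name distance_and_foot is made up for this illustration. It takes the point a, a known point p on the line, and the direction vector v.

```python
import math

def distance_and_foot(a, p, v):
    """Distance from the point a to the line r = p + lam*v,
    and the foot of the perpendicular, following Proposition 8."""
    pa = tuple(a_i - p_i for a_i, p_i in zip(a, p))       # vector from p to a
    v_len = math.sqrt(sum(v_i * v_i for v_i in v))
    v_hat = tuple(v_i / v_len for v_i in v)               # unit vector along v
    proj = sum(pa_i * vh_i for pa_i, vh_i in zip(pa, v_hat))  # pa . v_hat (may be negative)
    dist_sq = sum(pa_i * pa_i for pa_i in pa) - proj ** 2     # Pythagorean Theorem
    distance = math.sqrt(max(dist_sq, 0.0))               # guard against tiny negative rounding
    foot = tuple(p_i + proj * vh_i for p_i, vh_i in zip(p, v_hat))
    return distance, foot

# Sample values: a = (1, 2, 3) and the line r = (0, 1, 2) + lam*(9, 1, 3).
print(distance_and_foot((1, 2, 3), (0, 1, 2), (9, 1, 3)))
```

Because pa ⋅ v̂ is used with its sign in part (b), the same code handles both cases of the proof without any branching.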


Example 320. Consider the point a = (1, 2, 3) and the line r = (0, 1, 2) + λ(9, 1, 3) (λ ∈ R).

Pick a point on the line, say p = (0, 1, 2). We have pa = a − p = (1, 1, 1) and so |pa|² = 1² + 1² + 1² = 3. Also,

pa ⋅ v̂ = (1, 1, 1) ⋅ (9, 1, 3) / √(9² + 1² + 3²) = 13/√91.

And so (pa ⋅ v̂)² = 169/91. Hence, the length of the side is

√(3 − 169/91) = √(104/91) = √(8/7) ≈ 1.069.

This is the distance between point a and the line l. Moreover,

b = (0, 1, 2) + (13/√91) × (9, 1, 3)/√91 = (1/7)(9, 8, 17).

[Figure (not to scale): the line l through p = (0, 1, 2), the point a = (1, 2, 3), and the foot of the perpendicular b = (1/7)(9, 8, 17). The distance between a and b is 1.069.]

Note that in this example, v and the vector pb (from p to b) do point in the same direction and we have pa ⋅ v̂ > 0. In contrast, in the next example, v and pb will point in opposite directions and we will have pa ⋅ v̂ < 0.

Example 321. Consider the point a = (−1, 0, 1) and the line r = (3, 2, 1) + λ(5, 1, 2) (λ ∈ R).

Pick a point on the line, say p = (3, 2, 1). We have pa = a − p = (−4, −2, 0) and so |pa|² = 4² + 2² + 0² = 20. Also,

pa ⋅ v̂ = (−4, −2, 0) ⋅ (5, 1, 2) / √(5² + 1² + 2²) = −22/√30.

(As noted, pa ⋅ v̂ < 0 and sure enough, v and the vector pb (from p to b) point in opposite directions.) So (pa ⋅ v̂)² = 484/30. Hence, the length of the side is

√(20 − 484/30) = √(116/30) = √(58/15) ≈ 1.966.

This is the distance between point a and the line l. Moreover,

b = (3, 2, 1) − (22/√30) × (5, 1, 2)/√30 = (1/15)(−10, 19, −7).

[Figure (not to scale): the line l through p = (3, 2, 1), the point a = (−1, 0, 1), and the foot of the perpendicular b = (1/15)(−10, 19, −7). The distance between a and b is 1.966.]

Exercise 130. For each of the following, find (i) the distance between the given point a and the given line l; and also (ii) the point b on the line that is closest to a. (Answers on pp. 1016, 1017, and 1018.) (a) The point a = (7, 3, 4) and the line l described by r = (8, 3, 4) + λ(9, 3, 7).

(b) The point a = (8, 0, 2) and the line l described by r = (4, 4, 3) + λ(2, 7, 2).

(c) The point a = (8, 5, 9) and the line l described by r = (8, 4, 5) + λ(5, 6, 0).


We now learn a second method for finding the foot of a perpendicular and hence the distance of a point to a line. This second method involves calculus and finding the minimum point. It also occasionally features on the A-level exams.

Example 322. Consider the point a = (1, 2, 3) and the line r = (0, 1, 2) + λ(9, 1, 3) (λ ∈ R). The distance between a and a generic point r on the line is

|a − r| = |(1, 2, 3) − (9λ, 1 + λ, 2 + 3λ)| = |(1 − 9λ, 1 − λ, 1 − 3λ)|
= √((1 − 9λ)² + (1 − λ)² + (1 − 3λ)²)
= √(91λ² − 26λ + 3).

Recall what λ means. It is a parameter: as λ varies, the vector equation r = (0, 1, 2) + λ(9, 1, 3) gives us another point of the line. So now the expression √(91λ² − 26λ + 3) tells us: As λ varies, the distance between the point a and the corresponding point r on the line is √(91λ² − 26λ + 3).

Our goal is to find the point r on the line that is closest to the point a. In other words, our goal is to minimise √(91λ² − 26λ + 3). So we can look for the minimum point of √(91λ² − 26λ + 3).

To simplify matters, note that minimising √(91λ² − 26λ + 3) is the same as minimising 91λ² − 26λ + 3. So we might as well look for the minimum point of 91λ² − 26λ + 3. To this end, set the derivative equal to zero:

d/dλ (91λ² − 26λ + 3) = 182λ − 26 = 0 ⇐⇒ λ = 26/182 = 1/7.

(This is indeed a minimum, because the coefficient on λ² is positive.)

Altogether then, the point b on the line l that is closest to the point a has parameter λ = 1/7. So b = (0, 1, 2) + (1/7)(9, 1, 3) = (1/7)(9, 8, 17). And the distance between a and l (or equivalently, the length of the line segment ab) is

√(91λ² − 26λ + 3) = √(91(1/7)² − 26(1/7) + 3) = √(8/7).

Of course, these are the same as what we found in Example 320 a few pages ago.
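The calculus method always leads to a quadratic Aλ² + Bλ + C in λ, so the minimising λ can be read off from 2Aλ + B = 0. The sketch below works this out with exact fractions. The function name closest_parameter is made up for this illustration, and integer inputs are assumed.

```python
from fractions import Fraction

def closest_parameter(a, p, v):
    """For the line r = p + lam*v, expand |a - r|^2 = A*lam^2 + B*lam + C
    and return ((A, B, C), minimising lam, minimum squared distance)."""
    pa = tuple(a_i - p_i for a_i, p_i in zip(a, p))
    A = sum(v_i * v_i for v_i in v)                       # coefficient of lam^2 (positive)
    B = -2 * sum(pa_i * v_i for pa_i, v_i in zip(pa, v))  # coefficient of lam
    C = sum(pa_i * pa_i for pa_i in pa)                   # constant term
    lam = Fraction(-B, 2 * A)                             # solves 2*A*lam + B = 0
    min_sq = A * lam * lam + B * lam + C
    return (A, B, C), lam, min_sq

# Example 322: a = (1, 2, 3), line r = (0, 1, 2) + lam*(9, 1, 3).
print(closest_parameter((1, 2, 3), (0, 1, 2), (9, 1, 3)))
```

The distance itself is the square root of the last value returned, which is 8/7 for Example 322.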


Example 323. Consider the point a = (−1, 0, 1) and the line described by the vector equation r = (3, 2, 1) + λ(5, 1, 2) (λ ∈ R). The distance between a and a generic point r on the line is

$$|a - r| = \left| \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} - \begin{pmatrix} 3 + 5\lambda \\ 2 + \lambda \\ 1 + 2\lambda \end{pmatrix} \right| = \left| \begin{pmatrix} -4 - 5\lambda \\ -2 - \lambda \\ -2\lambda \end{pmatrix} \right| = \sqrt{(-4 - 5\lambda)^2 + (-2 - \lambda)^2 + (-2\lambda)^2} = \sqrt{30\lambda^2 + 44\lambda + 20}.$$

Now look for the minimum point of $30\lambda^2 + 44\lambda + 20$:

$$\frac{d}{d\lambda}\left(30\lambda^2 + 44\lambda + 20\right) = 60\lambda + 44 \overset{\text{set}}{=} 0 \iff \lambda = \frac{-44}{60} = \frac{-11}{15}.$$

So b = (3, 2, 1) − 11/15(5, 1, 2) = 1/15(−10, 19, −7).

And the distance between a and l (or equivalently, the length of the line segment ab) is

$$\sqrt{30\lambda^2 + 44\lambda + 20} = \sqrt{30\left(\frac{-11}{15}\right)^2 + 44\left(\frac{-11}{15}\right) + 20} = \sqrt{\frac{58}{15}}.$$

Of course, these are the same as what we found in Example 321 a few pages ago.

Exercise 131. For each of the following, use the second method (calculus) to find (i) the distance between the given point a and the given line l; and also (ii) the point b on the line that is closest to a. (Answers on p. 1019.)

(a) The point a = (7, 3, 4) and the line l described by r = (8, 3, 4) + λ(9, 3, 7).

(b) The point a = (8, 0, 2) and the line l described by r = (4, 4, 3) + λ(2, 7, 2).

(c) The point a = (8, 5, 9) and the line l described by r = (8, 4, 5) + λ(5, 6, 0).

33.2 Distance of a Point from a Plane

This is very much analogous to the distance of a point from a line.

Definition 86. The foot of the perpendicular from a point a to a plane P is the point b on the plane P that is closest to the point a. The distance between the point a and the plane P is the length of the line segment ab.

[Figure: a point a above a plane containing the points p and b, with the distance between a and b marked.]

Proposition 9. Given a point a (with position vector a) and a plane given in Hessian normal form $r \cdot \hat{n} = \hat{d}$,

(a) The distance between the point and the plane is $|\hat{d} - a \cdot \hat{n}|$; and

(b) The foot of the perpendicular from the point to the plane is the point $a + (\hat{d} - a \cdot \hat{n})\,\hat{n}$.

Proof. Let b be the foot of the perpendicular from the point to the plane.

(a) Pick any point p on the plane, so that $p \cdot \hat{n} = \hat{d}$. The length of the line segment ab (and hence also the distance between the point and the plane) is simply the length of the projection of $\overrightarrow{ap}$ on the plane’s normal vector, which is simply $|\overrightarrow{ap} \cdot \hat{n}| = |(p - a) \cdot \hat{n}| = |\hat{d} - a \cdot \hat{n}|$, as desired.

(b) The point b is a distance $|\hat{d} - a \cdot \hat{n}|$ away from a, heading in the direction $\overrightarrow{ab}$. Hence,

$$b = a + |\hat{d} - a \cdot \hat{n}| \, \widehat{\overrightarrow{ab}}. \qquad (*)$$

There are two possible cases to examine.

Case #1: If $\hat{n}$ is pointing in the same direction as $\overrightarrow{ab}$, then $\hat{n} = \widehat{\overrightarrow{ab}}$. Moreover $\overrightarrow{ap} \cdot \hat{n} = \hat{d} - a \cdot \hat{n} > 0$, so that $|\hat{d} - a \cdot \hat{n}| = \hat{d} - a \cdot \hat{n}$. Altogether then, $(*)$ becomes $b = a + (\hat{d} - a \cdot \hat{n})\,\hat{n}$, as desired. ✓

Case #2: If $\hat{n}$ and $\overrightarrow{ab}$ are pointing in opposite directions, then $\hat{n} = -\widehat{\overrightarrow{ab}}$. Moreover $\overrightarrow{ap} \cdot \hat{n} = \hat{d} - a \cdot \hat{n} < 0$, so that $|\hat{d} - a \cdot \hat{n}| = -(\hat{d} - a \cdot \hat{n})$. Altogether then, $(*)$ becomes $b = a - (\hat{d} - a \cdot \hat{n})(-\hat{n}) = a + (\hat{d} - a \cdot \hat{n})\,\hat{n}$, as desired. ✓

Example 324. Consider the point a = (1, 2, 3) and the plane described by the vector equation r ⋅ (1, 1, 1) = 3.

Convert the vector equation of the plane to Hessian normal form:

$$r \cdot \left(\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}\right) = \frac{3}{\sqrt{3}} = \sqrt{3}.$$

So $\hat{n} = \frac{\sqrt{3}}{3}(1, 1, 1)$, $\hat{d} = \sqrt{3}$, and $a \cdot \hat{n} = 2\sqrt{3}$.

Altogether then, the distance between the point and the plane is $|\hat{d} - a \cdot \hat{n}| = |\sqrt{3} - 2\sqrt{3}| = \sqrt{3}$ and the foot of the perpendicular is

$$a + (\hat{d} - a \cdot \hat{n})\,\hat{n} = (1, 2, 3) + (\sqrt{3} - 2\sqrt{3})\,\frac{\sqrt{3}}{3}(1, 1, 1) = (0, 1, 2).$$

By the way, notice that in this example, $\hat{n}$ points in the opposite direction from $\overrightarrow{ab}$. And so $\widehat{\overrightarrow{ab}} = -\hat{n}$. And moreover, $\hat{d} - a \cdot \hat{n} < 0$.

[Figure (not to scale): the point a = (1, 2, 3), the plane, the foot of the perpendicular b = (0, 1, 2), and the distance between a and b.]
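Proposition 9 translates directly into a few lines of code. The following is a minimal Python sketch, not part of the original text; the function name `point_to_plane` is hypothetical. It assumes the plane is given as r ⋅ n = d with n not necessarily a unit vector, so it first converts to Hessian normal form. It uses floating-point arithmetic, so results are approximate.

```python
from math import sqrt, isclose

def point_to_plane(a, n, d):
    """Distance from point a to the plane r . n = d, and the foot of the perpendicular.

    Converts to Hessian normal form (unit normal n_hat, constant d_hat), then applies
    distance = |d_hat - a . n_hat| and foot = a + (d_hat - a . n_hat) * n_hat.
    """
    norm = sqrt(sum(ni * ni for ni in n))
    n_hat = [ni / norm for ni in n]
    d_hat = d / norm
    gap = d_hat - sum(ai * ni for ai, ni in zip(a, n_hat))  # signed quantity d_hat - a . n_hat
    foot = [ai + gap * ni for ai, ni in zip(a, n_hat)]
    return abs(gap), foot

# Example 324: a = (1, 2, 3), plane r . (1, 1, 1) = 3
dist, foot = point_to_plane((1, 2, 3), (1, 1, 1), 3)
print(dist, foot)  # distance is about sqrt(3); foot is about (0, 1, 2)
```

Note that the sign of `gap` tells us which of the two cases in the proof applies: it is negative in Example 324, where the unit normal points away from the plane as seen from a.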

Example 325. Consider the point a = (0, 0, 0) and the plane described by the vector equation r ⋅ (1, 2, 3) = 32.

Convert the vector equation of the plane to Hessian normal form:

$$r \cdot \left(\frac{1}{\sqrt{14}}, \frac{2}{\sqrt{14}}, \frac{3}{\sqrt{14}}\right) = \frac{32}{\sqrt{14}}.$$

So $\hat{n} = \frac{1}{\sqrt{14}}(1, 2, 3)$, $\hat{d} = \frac{32}{\sqrt{14}}$, and $a \cdot \hat{n} = 0$.

Altogether then, the distance between the point and the plane is $|\hat{d} - a \cdot \hat{n}| = \left|\frac{32}{\sqrt{14}} - 0\right| = \frac{32}{\sqrt{14}}$ and the foot of the perpendicular is

$$a + (\hat{d} - a \cdot \hat{n})\,\hat{n} = (0, 0, 0) + \left(\frac{32}{\sqrt{14}} - 0\right)\frac{1}{\sqrt{14}}(1, 2, 3) = \frac{16}{7}(1, 2, 3).$$

By the way, notice that in this example, $\hat{n}$ points in the same direction as $\overrightarrow{ab}$. And so $\widehat{\overrightarrow{ab}} = \hat{n}$. And moreover, $\hat{d} - a \cdot \hat{n} > 0$.

[Figure (not to scale): the point a = (0, 0, 0), the plane containing the point p = (4, 5, 6), the foot of the perpendicular b = 16/7(1, 2, 3), and the distance between a and b.]

Exercise 132. For each of the following, find (i) the distance between the given point a and the given plane P; and also (ii) the point b on the plane that is closest to a. (Answers on pp. 1020, 1021, and 1022.)

(a) a = (7, 3, 4), P ∶ r ⋅ (9, 3, 7) = 109.

(b) a = (8, 0, 2), P ∶ r ⋅ (2, 7, 2) = 42.

(c) a = (8, 5, 9), P ∶ r ⋅ (5, 6, 0) = 64.

34 Angles

34.1 Angle between Two Lines (2D)

Consider two lines on the 2D cartesian plane that are parallel (and thus either do not intersect or are identical). We define the angle between them to be 0.

We define the angle between two parallel lines to be 0.

Now consider two lines that intersect (see diagram below). Taking their intersection point to be the vertex, A and B are, respectively, the acute and obtuse angles between the two lines. Of course, there is the possibility that the two lines are perpendicular, in which case A and B are both right (i.e. equal to π/2).

So when talking about “the angle between two lines”, there is some potential for confusion. Are we talking about angle A or angle B? By convention, the angle between two lines is the smaller angle. (Also, on the A-level exams, they are usually quite careful to specify that they want the acute angle, so that there is no confusion.)

If we have the direction vectors of the lines, then we can simply use what we learnt about the scalar product to compute the angle between them.

Example 326. Consider the lines (on the 2D cartesian plane) r = (1, 3) + λ(2, 1) and r = (−1, −1) + λ(1, 3) (λ ∈ R). The angle θ between their direction vectors v1 = (2, 1) and v2 = (1, 3) is given by

$$\theta = \cos^{-1}\left(\frac{v_1 \cdot v_2}{|v_1|\,|v_2|}\right) = \cos^{-1}\left(\frac{(2, 1) \cdot (1, 3)}{|(2, 1)|\,|(1, 3)|}\right) = \cos^{-1}\left(\frac{5}{\sqrt{5}\sqrt{10}}\right) \approx 0.785.$$

So the acute angle between the two lines is 0.785.

[Figure: the two lines on the x-y plane, together with the vectors (2, 1) and (1, 3); the acute angle A = 0.785 is marked both between the lines and between the vectors.]

Example 327. Consider the lines (on the 2D cartesian plane) r = (0, 0) + λ(−2, 3) and r = (1, 0) + λ(3, 1) (λ ∈ R). The angle θ between their direction vectors v1 = (−2, 3) and v2 = (3, 1) is given by

$$\theta = \cos^{-1}\left(\frac{v_1 \cdot v_2}{|v_1|\,|v_2|}\right) = \cos^{-1}\left(\frac{(-2, 3) \cdot (3, 1)}{|(-2, 3)|\,|(3, 1)|}\right) = \cos^{-1}\left(\frac{-3}{\sqrt{13}\sqrt{10}}\right) \approx 1.837.$$

This is the obtuse angle between the two lines. So the acute angle between the two lines is A = π − 1.837 = 1.305.

[Figure: the lines r = (0, 0) + λ(−2, 3) and r = (1, 0) + λ(3, 1) on the x-y plane, together with the vectors (−2, 3) and (3, 1); the acute angle A = 1.305 and the obtuse angle B = 1.837 are marked.]
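The recipe in these examples (compute the angle between the direction vectors, then take the smaller of θ and π − θ) can be sketched in Python. This is an illustrative sketch, not part of the original text; the function names `angle_between_vectors` and `angle_between_lines` are hypothetical. It works for direction vectors of any dimension and uses floating-point arithmetic, so answers are approximate.

```python
from math import acos, pi, sqrt, isclose

def angle_between_vectors(v1, v2):
    """Angle in [0, pi] between two non-zero vectors, via the scalar product."""
    dot = sum(x * y for x, y in zip(v1, v2))
    norms = sqrt(sum(x * x for x in v1)) * sqrt(sum(y * y for y in v2))
    # Clamp to [-1, 1] to guard against floating-point round-off before acos.
    return acos(max(-1.0, min(1.0, dot / norms)))

def angle_between_lines(v1, v2):
    """Angle between two lines with direction vectors v1, v2: the smaller of theta and pi - theta."""
    theta = angle_between_vectors(v1, v2)
    return min(theta, pi - theta)

print(angle_between_lines((2, 1), (1, 3)))     # Example 326: about 0.785
print(angle_between_vectors((-2, 3), (3, 1)))  # Example 327: about 1.837 (obtuse)
print(angle_between_lines((-2, 3), (3, 1)))    # Example 327: about 1.305 (acute)
```

Taking the minimum also handles parallel lines whose direction vectors point in opposite directions: the vector angle is π, so the angle between the lines comes out as 0, in line with the definition.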

Example 328. Consider the lines (on the 2D cartesian plane) r = (2, −2) + λ(3, 3) and r = (1, 1) + λ(−1, −1) (λ ∈ R). The angle θ between their direction vectors v1 = (3, 3) and v2 = (−1, −1) is given by

$$\theta = \cos^{-1}\left(\frac{v_1 \cdot v_2}{|v_1|\,|v_2|}\right) = \cos^{-1}\left(\frac{(3, 3) \cdot (-1, -1)}{|(3, 3)|\,|(-1, -1)|}\right) = \cos^{-1}\left(\frac{-6}{\sqrt{18}\sqrt{2}}\right) = \cos^{-1}\left(\frac{-6}{6}\right) = \pi.$$

So the two vectors are parallel (they point in exactly opposite directions). Which means that the two lines are parallel and so by definition, the angle between the two lines is 0.

Exercise 133. Find the acute angle between each of the following pairs of lines. (Answer on p. 1023.)

(a) r = (−1, 2) + λ(−1, 1) and r = (0, 0) + λ(2, −3) (λ ∈ R).

(b) r = (−1, 2) + λ(1, 5) and r = (0, 0) + λ(8, 1) (λ ∈ R).

(c) r = (−1, 2) + λ(2, 6) and r = (0, 0) + λ(3, 2) (λ ∈ R).

34.2 Angle between Two Lines (3D)

Visualising lines in 3D space is difficult. Which is why we tackled the 2D case first. It turns out that we compute angles between two lines in 3D space in exactly the same way as in the 2D case.

1. If two lines are parallel, then again we define the angle between them to be 0.

2. If two lines intersect, then again we take their intersection point to be the vertex and take the smaller angle formed to be the angle between the two lines.

On the 2D cartesian plane, the above were the only two possibilities: two lines either are parallel or intersect. In contrast, in 3D space, there is the third possibility that two lines neither are parallel nor intersect! As we’ll learn in section 35.1, any two lines that neither are parallel nor intersect are called skew lines. What is the angle between two skew lines, given that they do not intersect?

3. Given two skew lines, translate one of them so that they intersect. Examine the angle between the two now-intersecting lines. This is defined to be the angle between the two skew lines. The next example illustrates.

Example 329. Below, the red line and pink line are skew lines, i.e., they neither intersect nor are parallel. To find the angle between them, translate the red line upwards so that the new red dotted line intersects the pink line at the purple dot. The angle A is then defined to be the angle between the two skew lines.

[Figure: skew lines (lines that neither intersect nor are parallel) drawn on x, y, z axes. One of the lines is translated so that they intersect; the angle A is marked at the intersection.]

So once again, given any two lines, the angle between them is simply the angle between their direction vectors. So again the scalar product comes in handy.

Example 330. Consider the lines r = (0, 1, 2) + λ(9, 1, 3) and r = (4, 5, 6) + λ(3, 2, 1) (λ ∈ R). The angle θ between their direction vectors v1 = (9, 1, 3) and v2 = (3, 2, 1) is given by

$$\theta = \cos^{-1}\left(\frac{v_1 \cdot v_2}{|v_1|\,|v_2|}\right) = \cos^{-1}\left(\frac{(9, 1, 3) \cdot (3, 2, 1)}{|(9, 1, 3)|\,|(3, 2, 1)|}\right) = \cos^{-1}\left(\frac{32}{\sqrt{91}\sqrt{14}}\right) \approx 0.459.$$

So the acute angle between the two lines is 0.459.

Example 331. Consider the lines r = (−1, 2, 3) + λ(0, 1, 0) and r = (0, 0, 0) + λ(8, −3, 5) (λ ∈ R). Let θ be the angle between their direction vectors v1 = (0, 1, 0) and v2 = (8, −3, 5). Thus,

$$\theta = \cos^{-1}\left(\frac{v_1 \cdot v_2}{|v_1|\,|v_2|}\right) = \cos^{-1}\left(\frac{(0, 1, 0) \cdot (8, -3, 5)}{|(0, 1, 0)|\,|(8, -3, 5)|}\right) = \cos^{-1}\left(\frac{-3}{\sqrt{1}\sqrt{98}}\right) \approx 1.879.$$

So the obtuse angle between the two lines is 1.879. And the angle between the two lines is π − 1.879 = 1.263.

Example 332. Consider the lines r = (1, 3, 3) + λ(1, 5, 3) and r = (7, 4, 7) + λ(7, −2, 1) (λ ∈ R). Let θ be the angle between their direction vectors v1 = (1, 5, 3) and v2 = (7, −2, 1). Thus,

$$\theta = \cos^{-1}\left(\frac{v_1 \cdot v_2}{|v_1|\,|v_2|}\right) = \cos^{-1}\left(\frac{(1, 5, 3) \cdot (7, -2, 1)}{|(1, 5, 3)|\,|(7, -2, 1)|}\right) = \cos^{-1}(0) = \frac{\pi}{2}.$$

So the two lines are perpendicular and the angle between them is right (i.e. π/2).

Exercise 134. Find the angle between each of the following pairs of lines. (Answer on p. 1023.)

(a) r = (−1, 2, 3) + λ(−1, 1, 0) and r = (0, 0, 0) + λ(2, −3, 4) (λ ∈ R).

(b) r = (−1, 2, 3) + λ(1, 5, 6) and r = (0, 0, 0) + λ(8, 1, 1) (λ ∈ R).

(c) r = (−1, 2, 3) + λ(2, 6, 7) and r = (0, 0, 0) + λ(3, 2, 1) (λ ∈ R).

34.3 Angle between a Line and a Plane

Fact 38. The angle between the line r = p + λv and the plane r ⋅ n = d is

$$A = \sin^{-1}\left|\frac{v \cdot n}{|v|\,|n|}\right|.$$

Proof. Let θ be the angle between the line’s direction vector and the plane’s normal vector. θ satisfies cos θ = v ⋅ n/(∣v∣ ∣n∣).

If θ is acute (or right), then the angle between the line and the plane is A = π/2 − θ. Thus,

$$\sin A = \sin\left(\frac{\pi}{2} - \theta\right) = \sin\left(\frac{\pi}{2}\right)\cos\theta - \sin\theta\cos\left(\frac{\pi}{2}\right) = \cos\theta = \frac{v \cdot n}{|v|\,|n|}.$$

Note that if θ ∈ (0, π/2], then v ⋅ n ≥ 0, so that ∣v ⋅ n∣ = v ⋅ n. Altogether, we indeed have

$$\sin A = \left|\frac{v \cdot n}{|v|\,|n|}\right| \quad \text{or} \quad A = \sin^{-1}\left|\frac{v \cdot n}{|v|\,|n|}\right|.$$

If θ is obtuse (or straight), then A = θ − 0.5π. And so

$$\sin A = \sin\left(\theta - \frac{\pi}{2}\right) = \sin\theta\cos\left(\frac{\pi}{2}\right) - \sin\left(\frac{\pi}{2}\right)\cos\theta = -\cos\theta = \frac{-v \cdot n}{|v|\,|n|}.$$

Note that if θ ∈ (π/2, π], then v ⋅ n < 0, so that ∣v ⋅ n∣ = −v ⋅ n. Altogether, we indeed have

$$\sin A = \left|\frac{v \cdot n}{|v|\,|n|}\right| \quad \text{or} \quad A = \sin^{-1}\left|\frac{v \cdot n}{|v|\,|n|}\right|.$$

Example 333. The angle between the line r = (0, 1, 2) + λ(9, 1, 3) (λ ∈ R) and the plane r ⋅ (1, 1, 1) = 3 is

$$\sin^{-1}\left|\frac{v \cdot n}{|v|\,|n|}\right| = \sin^{-1}\left|\frac{(9, 1, 3) \cdot (1, 1, 1)}{|(9, 1, 3)|\,|(1, 1, 1)|}\right| = \sin^{-1}\left|\frac{13}{\sqrt{91}\sqrt{3}}\right| = \sin^{-1}\left|\frac{\sqrt{13}}{\sqrt{7}\sqrt{3}}\right| \approx 0.906.$$

Example 334. The angle between the line r = (4, 2, 3) + λ(1, 0, 1) (λ ∈ R) and the plane r ⋅ (−1, −1, 0) = 5 is

$$\sin^{-1}\left|\frac{v \cdot n}{|v|\,|n|}\right| = \sin^{-1}\left|\frac{(1, 0, 1) \cdot (-1, -1, 0)}{|(1, 0, 1)|\,|(-1, -1, 0)|}\right| = \sin^{-1}\left|\frac{-1}{\sqrt{2}\sqrt{2}}\right| = \sin^{-1}\left(\frac{1}{2}\right) = \frac{\pi}{6}.$$

Example 335. The angle between the line r = (5, 5, 5) + λ(1, 0, 1) (λ ∈ R) and the plane r ⋅ (0, 1, 0) = 3 is

$$\sin^{-1}\left|\frac{v \cdot n}{|v|\,|n|}\right| = \sin^{-1}\left|\frac{(1, 0, 1) \cdot (0, 1, 0)}{|(1, 0, 1)|\,|(0, 1, 0)|}\right| = \sin^{-1}|0| = 0.$$
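Fact 38 can be checked numerically against the three examples above. This is a minimal Python sketch, not part of the original text; the function name `angle_line_plane` is hypothetical. It takes only the line’s direction vector v and the plane’s normal vector n, since the point p and the constant d play no role in the angle.

```python
from math import asin, pi, sqrt, isclose

def angle_line_plane(v, n):
    """Angle between a line with direction v and a plane with normal n: asin(|v . n| / (|v| |n|))."""
    dot = sum(x * y for x, y in zip(v, n))
    norms = sqrt(sum(x * x for x in v)) * sqrt(sum(y * y for y in n))
    # Cap at 1 to guard against floating-point round-off before asin.
    return asin(min(1.0, abs(dot) / norms))

print(angle_line_plane((9, 1, 3), (1, 1, 1)))    # Example 333: about 0.906
print(angle_line_plane((1, 0, 1), (-1, -1, 0)))  # Example 334: about pi/6
print(angle_line_plane((1, 0, 1), (0, 1, 0)))    # Example 335: 0.0
```

An angle of 0, as in Example 335, means the line is parallel to the plane; it does not say whether the line lies on the plane.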

Exercise 135. For each of the following, find the angle between the given line and plane. (Answer on p. 1024.)

(a) r = (−1, 2, 3) + λ(−1, 1, 0) (λ ∈ R) and r ⋅ (3, 4, 5) = 0.

(b) r = (−1, 2, 3) + λ(0, 2, 6) (λ ∈ R) and r ⋅ (1, 3, 5) = 2.

(c) r = (−1, 2, 3) + λ(1, 9, 8) (λ ∈ R) and r ⋅ (2, 8, 2) = 3.

34.4 Angle between Two Planes

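Given two planes P1 and P2, the angle between them is simply the angle between two vectors v1 and v2, one on each plane, each chosen perpendicular to the line along which the two planes intersect (see the figure).

[Figure: planes P1 and P2 with vectors v1 on P1 and v2 on P2 and normal vectors n1 and n2; the angle between the two planes and the angle between the two planes’ normal vectors are marked.]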

But the normal vector n1 of the first plane is orthogonal to v1 ; similarly, the normal vector n2 of the second plane is orthogonal to v2 . And so the angle between v1 and v2 is equal to the angle between n1 and n2 . Altogether then, the angle between two planes is simply the angle between their normal vectors. Again, there are two possible angles — by convention, we take the smaller one.


Example 336. Consider the planes r ⋅ (1, 1, 1) = 12 and r ⋅ (−1, −1, 0) = −1. The angle θ between the two planes is:

$$\theta = \cos^{-1}\left(\frac{n_1 \cdot n_2}{|n_1|\,|n_2|}\right) = \cos^{-1}\left(\frac{(1, 1, 1) \cdot (-1, -1, 0)}{|(1, 1, 1)|\,|(-1, -1, 0)|}\right) = \cos^{-1}\left(\frac{-2}{\sqrt{3}\sqrt{2}}\right) = \cos^{-1}\left(\frac{-2}{\sqrt{6}}\right) \approx 2.526.$$

This is the obtuse angle. So the acute angle between the two planes is π − 2.526 = 0.615 radian.

Example 337. Consider the planes r ⋅ (2, 1, 3) = 26 and r ⋅ (−3, 0, 5) = −25. The angle θ between the two planes is

$$\theta = \cos^{-1}\left(\frac{n_1 \cdot n_2}{|n_1|\,|n_2|}\right) = \cos^{-1}\left(\frac{(2, 1, 3) \cdot (-3, 0, 5)}{|(2, 1, 3)|\,|(-3, 0, 5)|}\right) = \cos^{-1}\left(\frac{9}{\sqrt{14}\sqrt{34}}\right) \approx 1.146.$$
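As a numerical check on these two examples, here is a short Python sketch, not part of the original text; the function name `acute_angle_between_planes` is hypothetical. Instead of computing θ and then subtracting from π when θ is obtuse, it takes the absolute value of the scalar product first. Since cos(π − θ) = −cos θ, this returns the smaller of the two angles directly.

```python
from math import acos, sqrt, isclose

def acute_angle_between_planes(n1, n2):
    """Smaller angle between two planes with normal vectors n1 and n2."""
    dot = sum(x * y for x, y in zip(n1, n2))
    norms = sqrt(sum(x * x for x in n1)) * sqrt(sum(y * y for y in n2))
    # |cos(theta)| picks out the smaller of theta and pi - theta;
    # cap at 1 to guard against floating-point round-off before acos.
    return acos(min(1.0, abs(dot) / norms))

print(acute_angle_between_planes((1, 1, 1), (-1, -1, 0)))  # Example 336: about 0.615
print(acute_angle_between_planes((2, 1, 3), (-3, 0, 5)))   # Example 337: about 1.146
```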

Exercise 136. Find the angle between the two given planes. (Answer on p. 1025.)

(a) r ⋅ (−1, −2, −3) = 1 and r ⋅ (3, 4, 5) = 2.

(b) r ⋅ (1, −2, 3) = 3 and r ⋅ (5, 1, 1) = 4.

(c) r ⋅ (1, 1, 8) = 5 and r ⋅ (−3, 0, 10) = 6.

35 Relationships between Lines and Planes

35.1 Relationship between Two Lines

Definition 87. Two lines are parallel if their direction vectors can be written as scalar multiples of each other.

Example 338. The lines r = (0, 0, 0) + λ(0, 1, 0) and r = (4, 17, 0) + λ(1, 0, 0) (λ ∈ R) are not parallel, because (0, 1, 0) cannot be written as a scalar multiple of (1, 0, 0).

Example 339. The lines r = (8, 1, 1) + λ(3, 6, 9) and r = (4, 5, 6) + λ(1, 2, 3) (λ ∈ R) are parallel, because (3, 6, 9) = 3(1, 2, 3).


Definition 88. A set of points are coplanar if there is some plane that contains all of these points.

Any two points are always coplanar — indeed, they are collinear (p1 and p2 in the figure below). Three points are also always coplanar, although they may not be collinear (p1 , p2 , and p3 in the figure below). But four points may not be coplanar (p1 , p2 , p3 , and p4 in the figure below).

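[Figure: a plane containing Line 1 and the points p1, p2, and p3, with the point p4 off the plane and Line 2 cutting through the plane. Captions: Two points are coplanar. They also lie on the same line. Three points are coplanar, though not necessarily on the same line. Four points may not be coplanar.]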

Definition 89. Two lines are coplanar if there is some plane on which both lie. Two lines that are not coplanar are called skew lines.

Example 340. In the figure above, Line 1 and Line 2 are skew lines. Line 1 lies on the plane illustrated. Line 2 cuts through the plane and does not intersect Line 1.


How do we determine whether two lines l1 and l2 are coplanar or skew? Well,

1. If they are parallel, then obviously we can construct a plane that contains both lines. And so the two lines are coplanar.

2. If they are not parallel and they lie on the same plane, then they must intersect. This is just the familiar fact you learnt in primary school: two non-parallel lines on the plane must definitely intersect.

Altogether we conclude:

Fact 39. Two lines are coplanar if and only if they (i) are parallel; OR (ii) intersect. Equivalently, two lines are skew if and only if they (i) are not parallel; AND (ii) do not intersect.

Example 341. Consider the lines r = (8, 1, 1) + λ(3, 6, 9) and r = (4, 5, 6) + λ(1, 2, 3) (λ ∈ R). The direction vector of one can be written as the scalar multiple of the other, so they are parallel. Hence, they are also coplanar; or equivalently, they are not skew.

Example 342. Consider the lines r = (0, 0, 0) + λ(0, 1, 0) and r = (4, 17, 0) + λ(1, 0, 0) (λ ∈ R). The direction vector of one cannot be written as the scalar multiple of the other, so they are not parallel. If they intersect, then there are reals α and β such that (0, 0, 0) + α(0, 1, 0) = (4, 17, 0) + β(1, 0, 0), or 0 = 4 + β, α = 17, and 0 = 0.

α = 17, β = −4 solves the above equations. (What does this mean? This means that the first line goes through the point (0, 0, 0) + α(0, 1, 0) = (0, 17, 0) and the second line also goes through the same point (4, 17, 0) + β(1, 0, 0) = (0, 17, 0).)

The two lines intersect at (0, 17, 0). And so they are coplanar; or equivalently, they are not skew.

If we’d like, we can easily find the plane on which these two lines lie. Remember: All we need are two non-parallel vectors and a point to determine a plane. We already have two non-parallel vectors, namely the direction vectors of the two lines. Using these, we can find a normal vector for the plane, namely (0, 1, 0) × (1, 0, 0) = (0, 0, −1). Noting also that the origin is on the first line and therefore on the plane, we conclude that the plane is r ⋅ (0, 0, −1) = 0.

Example 343. Consider the lines r = (0, 1, 2) + λ(9, 1, 3) and r = (4, 5, 6) + λ(3, 2, 1) (λ ∈ R). They are not parallel. Let’s see if they have an intersection point. If they intersect, then there are reals α and β such that (0, 1, 2) + α(9, 1, 3) = (4, 5, 6) + β(3, 2, 1), or

$$9\alpha = 4 + 3\beta \quad (1), \qquad 1 + \alpha = 5 + 2\beta \quad (2), \qquad 2 + 3\alpha = 6 + \beta \quad (3).$$

Take 2 × (3) minus (2) to get (4 + 6α) − (1 + α) = (12 + 2β) − (5 + 2β) or 3 + 5α = 7 or α = 0.8. Now from (2), this means that β = −1.6. These do not work if we try plugging them into (1): the left side is 9(0.8) = 7.2, but the right side is 4 + 3(−1.6) = −0.8. Hence, there are no reals α and β that solve the above system of equations. In other words, the two lines do not intersect.

And so the two lines are not coplanar; or equivalently, they are skew.
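The classification in Fact 39 can also be automated. The Python sketch below is not part of the original text, and it swaps in a different (but equivalent) test from the one used in the examples: rather than solving for α and β, it checks whether the vector joining a point on each line is perpendicular to d1 × d2 (a scalar triple product test). Two non-parallel lines intersect exactly when this is so. The function names `cross`, `dot`, and `classify_lines` are hypothetical, and the inputs are assumed to be integers so that comparisons with 0 are exact.

```python
def cross(u, v):
    """Vector product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(u, v))

def classify_lines(p1, d1, p2, d2):
    """Classify the lines r = p1 + a*d1 and r = p2 + b*d2 (integer coordinates assumed)."""
    n = cross(d1, d2)
    if n == (0, 0, 0):
        return "parallel (coplanar)"
    gap = tuple(y - x for x, y in zip(p1, p2))  # p2 - p1
    # Non-parallel lines intersect exactly when p2 - p1 lies in the plane spanned by d1 and d2.
    if dot(gap, n) == 0:
        return "intersecting (coplanar)"
    return "skew"

print(classify_lines((8, 1, 1), (3, 6, 9), (4, 5, 6), (1, 2, 3)))   # Example 341
print(classify_lines((0, 0, 0), (0, 1, 0), (4, 17, 0), (1, 0, 0)))  # Example 342
print(classify_lines((0, 1, 2), (9, 1, 3), (4, 5, 6), (3, 2, 1)))   # Example 343
```

When the lines are coplanar and not parallel, `cross(d1, d2)` is also a normal vector for the plane containing both lines, as in Example 342.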

Exercise 137. Determine whether each of the following pairs of lines is coplanar or skew. If they are coplanar, find the plane that contains both of them. (Answer on p. 1026.)

(a) r = (8, 1, 5) + λ(3, 2, 1) and r = (1, 2, 3) + λ(5, 6, 7) (λ ∈ R).

(b) r = (0, 0, 6) + λ(3, 9, 0) and r = (1, 1, 1) + λ(1, 3, 0) (λ ∈ R).

(c) r = (6, 5, 5) + λ(1, 0, 1) and r = (8, 3, 6) + λ(0, 1, 1) (λ ∈ R).

35.2 Relationship between a Line and a Plane

Definition 90. A line with direction vector v and a plane with normal vector n are parallel if v ⋅ n = 0 (i.e. v and n are perpendicular).

The above definition makes sense, because if the line is perpendicular to the plane’s normal vector, then the line must be parallel to the plane itself.

Fact 40. Given a plane and a line, there are three possible cases (illustrated below):

1. The line and plane are parallel and do not intersect at all.

2. The line and plane are parallel and the line lies completely on the plane.

3. The line and plane are not parallel and intersect at exactly one point.

[Figure: a plane with Line 1, Line 2, and Line 3 illustrating the three cases.]

Proof. Optional, see p. 865 in the Appendices.

Note that if a line and a plane are parallel, then either (i) they do not intersect at all; or (ii) the line lies completely on the plane.

• So if a line and a plane are parallel and you can prove that they share at least one intersection point, then it must be that the line lies completely on the plane.

• Conversely, if a line and a plane are parallel and you can prove that there is at least one point on the line that is not on the plane, then it must be that they do not intersect at all.

Example 344. Consider the line r = (3, 5, 5) + λ(9, 1, 3) (λ ∈ R) and the plane r ⋅ (1, 1, 1) = 3. We have (9, 1, 3) ⋅ (1, 1, 1) = 13 ≠ 0 and so they are not parallel. They must therefore intersect at exactly one point. Let’s find it.

Plug in a generic point of the line into the equation for the plane:

$$[(3, 5, 5) + \lambda(9, 1, 3)] \cdot (1, 1, 1) = 3 \implies 3 + 9\lambda + 5 + \lambda + 5 + 3\lambda = 3 \implies 13 + 13\lambda = 3 \implies \lambda = \frac{-10}{13}.$$

So the intersection point is (3, 5, 5) − 10/13(9, 1, 3).

Example 345. Consider the line r = (3, 5, 5) + λ(9, 1, 3) (λ ∈ R) and the plane r ⋅ (1, 0, −3) = −6. We have (9, 1, 3) ⋅ (1, 0, −3) = 0 and so they are parallel. There are two possibilities. Either they do not intersect at all OR the line lies completely on the plane.

Since (3, 5, 5) ⋅ (1, 0, −3) = −12 ≠ −6, the point (3, 5, 5) is on the line but is not on the plane.

And so the line and plane do not intersect at all.

Example 346. Consider the line r = (3, 5, 3) + λ(9, 1, 3) (λ ∈ R) and the plane r ⋅ (1, 0, −3) = −6. We have (9, 1, 3) ⋅ (1, 0, −3) = 0 and so they are parallel. There are two possibilities. Either they do not intersect at all OR the line lies completely on the plane.

Since (3, 5, 3) ⋅ (1, 0, −3) = −6, the point (3, 5, 3) on the line is also on the plane.

Since they are parallel and share at least one intersection point, it must be that the line lies completely on the plane.
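The three cases of Fact 40 can be sketched as a short decision procedure in Python. This is an illustrative sketch, not part of the original text; the function names `dot` and `classify_line_plane` are hypothetical. It assumes integer inputs for the line r = p + λv and the plane r ⋅ n = d, and uses exact fractions for the intersection point.

```python
from fractions import Fraction

def dot(u, v):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(u, v))

def classify_line_plane(p, v, n, d):
    """Relationship between the line r = p + lam*v and the plane r . n = d (integer inputs assumed).

    Returns (description, intersection point or None).
    """
    v_dot_n = dot(v, n)
    if v_dot_n != 0:
        # Not parallel: solve (p + lam*v) . n = d for lam.
        lam = Fraction(d - dot(p, n), v_dot_n)
        point = tuple(pi + lam * vi for pi, vi in zip(p, v))
        return "intersect at one point", point
    if dot(p, n) == d:
        # Parallel, and the point p of the line is on the plane.
        return "parallel, line lies on the plane", None
    return "parallel, no intersection", None

print(classify_line_plane((3, 5, 5), (9, 1, 3), (1, 1, 1), 3))    # Example 344
print(classify_line_plane((3, 5, 5), (9, 1, 3), (1, 0, -3), -6))  # Example 345
print(classify_line_plane((3, 5, 3), (9, 1, 3), (1, 0, -3), -6))  # Example 346
```

For Example 344 the returned point is (3, 5, 5) − 10/13(9, 1, 3) written out as a single vector, namely 1/13(−51, 55, 35).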

Exercise 138. For each of the following, determine whether the given line and plane are (i) parallel but do not intersect; (ii) parallel with the line lying completely on the plane; or (iii) intersect at exactly one point. (Answer on p. 1027.)

(a) r = (4, 5, 6) + λ(2, 3, 5) (λ ∈ R) and r ⋅ (−10, 0, 4) = −26.

(b) r = (5, 5, 6) + λ(2, 3, 5) (λ ∈ R) and r ⋅ (−10, 0, 4) = −26.

(c) r = (4, 5, 6) + λ(2, 3, 5) (λ ∈ R) and r ⋅ (−10, 0, 3) = −26.

35.3 Relationship between Two Planes

Definition 91. Two planes are parallel if their normal vectors can be written as scalar multiples of each other.

(Note that an alternative definition is this: “Two planes are parallel if they do not intersect.” We will show that these two definitions are equivalent.)

Imagine that two planes intersect at some line, which we’ll call the intersection line. Since this intersection line is on both planes, it must also be perpendicular to the normal vectors of both planes. In other words, it must have direction vector n1 × n2. The next fact is thus not surprising (although actually proving it takes a little work).

Fact 41. Two non-parallel planes with normal vectors n1 and n2 intersect at all if and only if they intersect along a line with direction vector n1 × n2 (i.e. the line is perpendicular to both n1 and n2).

Proof. Optional, see p. 866 in the Appendices.

Fact 42. Given two planes, there are three possible cases:

1. The two planes are parallel and exactly identical.

2. The two planes are parallel and do not intersect at all.

3. The two planes are not parallel and share an intersection line with direction vector n1 × n2 (where n1, n2 are the normal vectors of the planes).

Proof. Optional, see p. 868 in the Appendices.

Example 347. Planes P1 and P2 are parallel and do not intersect at all. Planes P2 and P3 are not parallel and share an intersection line with direction vector n2 × n3.

[Figure: planes P1, P2, and P3 with normal vectors n2 and n3; the intersection line of P2 and P3 has direction vector n2 × n3.]

Note that analogous to our study of two lines, if two planes are parallel, then either (i) they do not intersect at all; or (ii) they are identical.

• So if two planes are parallel and you can prove that they share at least one intersection point, then it must be that the two planes are identical.

• Conversely, if two planes are parallel and you can prove that there is at least one point on one plane that is not on the other plane, then it must be that they do not intersect at all.

And in the case where they are not parallel, to find the intersection line, simply find a point p where the two planes intersect. Then the intersection line is simply

r = p + λ(n1 × n2), λ ∈ R.

Example 348. Consider the planes r ⋅ (7, 1, 1) = 42 and r ⋅ (1, 1, 2) = 6.

Clearly, (7, 1, 1) cannot be written as a scalar multiple of (1, 1, 2). So the two planes are not parallel and share an intersection line whose direction vector is (7, 1, 1) × (1, 1, 2) = (1, −13, 6). Find a point p = (x, y, z) where the two planes intersect:

$$7x + y + z = 42 \quad (1), \qquad x + y + 2z = 6 \quad (2).$$

There are infinitely many points where the two planes intersect. So why don’t we look for an intersection point where x = 0? I’ll call this the “plug in x = 0” trick. In which case (2) minus (1) yields z = −36 and then y = 78. Hence, the intersection line is r = (0, 78, −36) + λ(1, −13, 6) (λ ∈ R).

Example 349. Consider the planes r ⋅ (1, 1, 1) = 12 and r ⋅ (−1, −1, 0) = −1.

Clearly, (1, 1, 1) cannot be written as a scalar multiple of (−1, −1, 0). So the two planes are not parallel and share an intersection line whose direction vector is (1, 1, 1) × (−1, −1, 0) = (1, −1, 0). Find a point p = (x, y, z) where the two planes intersect:

$$x + y + z = 12 \quad (1), \qquad -x - y = -1 \quad (2).$$

Again, we can play the “plug in x = 0” trick. In which case (2) says that y = 1 and now from (1), we have z = 11. And so (0, 1, 11) is an intersection point of the two planes. Hence, the intersection line is r = (0, 1, 11) + λ(1, −1, 0) (λ ∈ R).

You can use the “plug in x = 0” trick whenever the intersection line has a direction vector with an x-coordinate that is not equal to 0.

But the “plug in x = 0” trick may not work if the intersection line has a direction vector with x-coordinate equal to 0.

Example 350. Consider the planes r ⋅ (0, 1, 3) = 0 and r ⋅ (−1, 1, 3) = 2.

Clearly, (0, 1, 3) cannot be written as a scalar multiple of (−1, 1, 3). So the two planes are not parallel and share an intersection line whose direction vector is (0, 1, 3) × (−1, 1, 3) = (0, −3, 1).

Find a point p = (x, y, z) where the two planes intersect:

$$y + 3z = 0 \quad (1), \qquad -x + y + 3z = 2 \quad (2).$$

Here the direction vector of the intersection line has x-coordinate 0. So the “plug in x = 0” trick might not work. And indeed it doesn’t, because if we plug in x = 0, then (1) and (2) are contradictory.

So let’s try the “plug in y = 0” trick instead, which I know will work because the y-coordinate of the direction vector of the intersection line is non-zero (it’s −3). Then from (1) we have z = 0 and now from (2) we have x = −2. And so (−2, 0, 0) is an intersection point of the two planes. Hence, the intersection line is

r = (−2, 0, 0) + λ(0, −3, 1), λ ∈ R.

Alternatively, we could also have used the “plug in z = 0” trick instead, which again I know will work because the z-coordinate of the direction vector of the intersection line is non-zero (it’s 1). Then from (1) we have y = 0 and now from (2) we have x = −2. And so again we find that (−2, 0, 0) is an intersection point of the two planes. And so again we would have concluded that the intersection line is

r = (−2, 0, 0) + λ(0, −3, 1), λ ∈ R.
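The “plug in a coordinate = 0” trick can be sketched in Python so that it automatically picks a coordinate that works. This is an illustrative sketch, not part of the original text; the function names `cross` and `intersection_line` are hypothetical. It assumes integer inputs for the planes r ⋅ n1 = d1 and r ⋅ n2 = d2, sets to 0 the first coordinate in which the direction vector n1 × n2 is non-zero, and solves the remaining two equations in two unknowns by Cramer’s rule (a standard technique, used here in place of the elimination done by hand in the examples).

```python
from fractions import Fraction

def cross(u, v):
    """Vector product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def intersection_line(n1, d1, n2, d2):
    """Intersection line of the planes r . n1 = d1 and r . n2 = d2 (integer inputs assumed).

    Returns (point, direction), or None if the planes are parallel.
    """
    direction = cross(n1, n2)
    if direction == (0, 0, 0):
        return None
    # Set to zero the first coordinate whose direction component is non-zero.
    # The remaining 2x2 system then has determinant equal to +/- that component.
    k = next(m for m in range(3) if direction[m] != 0)
    i, j = [m for m in range(3) if m != k]
    det = n1[i] * n2[j] - n1[j] * n2[i]
    point = [Fraction(0)] * 3
    point[i] = Fraction(d1 * n2[j] - n1[j] * d2, det)  # Cramer's rule
    point[j] = Fraction(n1[i] * d2 - d1 * n2[i], det)
    return tuple(point), direction

print(intersection_line((7, 1, 1), 42, (1, 1, 2), 6))   # Example 348
print(intersection_line((0, 1, 3), 0, (-1, 1, 3), 2))   # Example 350
```

In Example 350 the direction vector is (0, −3, 1), so the sketch skips x and sets y = 0 instead, which is the same choice made by hand above.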

Example 351. Consider the planes r ⋅ (4, 0, 3) = 5 and r ⋅ (−8, 0, −6) = −64.

Clearly, (4, 0, 3) can be written as a scalar multiple of (−8, 0, −6) and so the two planes are parallel.

Let’s check if they are identical. The point (5/4, 0, 0) is on the first plane, but not the second, because: (5/4, 0, 0) ⋅ (−8, 0, −6) = −10 ≠ −64.

So the two planes are not identical and do not intersect at all.

Example 352. Consider the planes r ⋅ (4, 0, 3) = 32 and r ⋅ (−8, 0, −6) = −64.

Clearly, (4, 0, 3) can be written as a scalar multiple of (−8, 0, −6) and so the two planes are parallel.

Let’s check if they are identical. The point (8, 0, 0) is on the first plane. It is also on the second plane because: (8, 0, 0) ⋅ (−8, 0, −6) = −64.

Since the two planes are parallel and share at least one intersection point, it must be that the two planes are exactly identical.

Exercise 139. For each of the following, determine whether the given pair of planes are parallel and identical, parallel and do not intersect, or are not parallel. If they are not parallel, determine also their intersection line. (Answer on p. 1028.)

(a) r ⋅ (4, 9, 3) = 61 and r ⋅ (1, 1, 2) = 19.

(b) r ⋅ (1, 1, 0) = 4 and r ⋅ (1, 6, 8) = 60.

(c) r ⋅ (4, 4, 8) = 56 and r ⋅ (1, 1, 2) = 12.

(d) r ⋅ (4, 4, 8) = 48 and r ⋅ (1, 1, 2) = 12.


35.4 Relationship between Three Planes

SYLLABUS ALERT

The relationship between three planes is included in the 9740 (old) syllabus, but not in the 9758 (revised) syllabus. So you can skip this section if you’re taking 9758.

Given 3 planes P1, P2, and P3, we can form 3 pairs of planes:

• P1 and P2;
• P1 and P3; and
• P2 and P3.

To find the relationship between the 3 planes is simply to find the relationships between each of these 3 pairs of planes. This can be insanely tedious, but there is nothing new here. Everything follows from what you learnt in the previous sections. Let’s nonetheless give a summary of the possibilities.

Given three planes, we have 3 possible cases, each of which can be broken up into several sub-cases, for a total of 8 distinct possibilities.

1. All 3 planes are parallel to each other. There are 3 possible sub-cases:

(a) All 3 planes are identical.
(b) Only 2 planes are identical.
(c) No 2 planes are identical.

[Figures: 3 parallel, identical planes (P1, P2, P3); 3 parallel planes, where P1 and P2 are identical; 3 parallel, non-intersecting planes.]

2. Two planes are parallel to each other, but the 3rd plane is not parallel to either of the first 2 planes. There are 2 possible sub-cases: (a) The first 2 planes are identical. And so here we are really back to the situation of two non-parallel planes, which we already covered in detail in the previous section. They intersect along a line. (b) The first 2 planes are not identical. And so the non-parallel plane intersects each of the other two planes along a separate line of intersection.

[Figure: two panels. (i) P1 and P2 are parallel and identical; P3 intersects both at the same line. (ii) P1 and P2 are parallel but non-identical; P3 intersects them at separate lines.]

3. No 2 planes are parallel. Each pair of planes intersects along a line. There are thus three intersection lines (though possibly some may be identical). It is possible to prove (but we won’t do so in this book) that there are only 3 possible sub-cases: (a) None of the intersection lines intersect with each other. That is, each pair of planes simply intersects along some distinct intersection line. (b) All 3 intersection lines are identical. So all 3 planes intersect along the same intersection line. (c) The 3 intersection lines and thus all 3 planes intersect at a single point. To determine which of the above sub-cases we’re in, we must determine the relation between each pair of intersection lines. This is tedious, but nothing new.

[Figure: three panels. (i) 3 non-parallel planes that intersect at different lines. (ii) 3 non-parallel planes that intersect at the same line. (iii) 3 non-parallel planes that intersect at only one point.]

Example 353. Consider the planes P1, P2, and P3, given by r ⋅ (1, 0, 0) = 0, r ⋅ (0, 1, 0) = 0, and r ⋅ (0, 0, 1) = 0.

Step #1. Check if any two planes are parallel.

By observation, no plane’s normal vector can be written as a scalar multiple of another plane’s normal vector. So no two planes are parallel. (So we are in Case 3.)

Step #2. Find the 3 intersection lines along which each pair of planes intersect.

By observation, all three planes contain the origin.

The planes P1 and P2 share an intersection line with direction vector (1, 0, 0) × (0, 1, 0) = (0, 0, 1) and so their intersection line is r = (0, 0, 0) + λ(0, 0, 1) (λ ∈ R). Call this line l1.

The planes P1 and P3 share an intersection line with direction vector (1, 0, 0) × (0, 0, 1) = (0, −1, 0) and so their intersection line is r = (0, 0, 0) + λ(0, −1, 0) (λ ∈ R). Call this line l2.

The planes P2 and P3 share an intersection line with direction vector (0, 1, 0) × (0, 0, 1) = (1, 0, 0) and so their intersection line is r = (0, 0, 0) + λ(1, 0, 0) (λ ∈ R). Call this line l3.

Step #3. Determine where, if at all, the 3 intersection lines intersect.

l1 and l2 are not parallel and both pass through the point (0, 0, 0), so that is their only intersection point. The same is true of l1 and l3, and of l2 and l3.

Conclusion. Altogether, we conclude that the 3 intersection lines intersect at a single point. Hence, the 3 planes also intersect at a single point. (So we are in Case 3c.)
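The single-point case (Case 3c) can also be checked numerically. The sketch below is not the pairwise-lines method used in the example above. It swaps in Cramer's rule on the three plane equations instead: if the determinant of the three normal vectors is non-zero, the planes meet at exactly one point. The function names `det3` and `single_intersection_point` are illustrative, and integer inputs are assumed.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def single_intersection_point(normals, rhs):
    """Return the unique common point of the planes r . n_i = d_i, or None.

    normals: the three normal vectors (integer components assumed).
    rhs: the three constants d_i.
    Only the single-point case is handled; None means "not Case 3c".
    """
    big_d = det3(normals)
    if big_d == 0:
        return None  # the planes do not meet at exactly one point
    point = []
    for col in range(3):
        # Cramer's rule: replace one column by the right-hand side.
        m = [list(row) for row in normals]
        for r in range(3):
            m[r][col] = rhs[r]
        point.append(Fraction(det3(m), big_d))
    return tuple(point)

# The three coordinate planes of Example 353 meet only at the origin.
print(single_intersection_point([(1, 0, 0), (0, 1, 0), (0, 0, 1)], [0, 0, 0]))
```

A `None` result only says that the planes are not in Case 3c. Telling the other seven possibilities apart still needs the pairwise checks described in this section.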


Example 354. Consider the planes P1 , P2 , and P3 , given by r⋅(1, 1, 0) = 1, r⋅(−2, −2, 0) = −4, and r ⋅ (0, 1, 1) = 1. Step #1. Check if any two planes are parallel.

By observation, P1 ’s normal vector (1, 1, 0) can be written as a scalar multiple of P2 ’s normal vector (−2, −2, 0). And so these two planes are parallel.

P3 is not parallel to either of the first two planes, since (0, 1, 1) cannot be written as a scalar multiple of (1, 1, 0) or (−2, −2, 0). Altogether then, we are in Case 2.

Step #2. Check if the two parallel planes are identical. They are not, because (1, 0, 0) is on P1 but is not on P2 , as we can easily verify — (1, 0, 0) ⋅ (−2, −2, 0) = −2 ≠ −4. So we are in Case 2b.

Step #3. Find the intersection lines. There are two: one shared by P1 and P3 and the other shared by P2 and P3. (P1 and P2 are distinct, parallel planes and thus do not intersect at all.)

The intersection line of P1 and P3 has direction vector (1, 1, 0) × (0, 1, 1) = (1, −1, 1). Let’s find a point (x, y, z) at which P1 and P3 intersect: the equations for the planes are x + y = 1 and y + z = 1.

Using the “plug in x = 0” trick, we see that they intersect at (0, 1, 0). Hence, their intersection line is r = (0, 1, 0) + λ(1, −1, 1) (λ ∈ R). Call this line l1.

The intersection line of P2 and P3 must also have direction vector (1, −1, 1). Let’s find a point (x, y, z) at which P2 and P3 intersect: the equations for the planes are −2x − 2y = −4 and y + z = 1.

Using the “plug in x = 0” trick, we see that they intersect at (0, 2, −1). Hence, their intersection line is r = (0, 2, −1) + λ(1, −1, 1) (λ ∈ R). Call this line l2.

The lines l1 and l2 are parallel and distinct, as expected in Case 2b: P3 cuts the two parallel planes P1 and P2 along two separate lines.
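The two steps used above (cross product for the direction, then the “plug in x = 0” trick for a point) can be sketched in code. The helper names `cross` and `intersection_line` are illustrative, and integer inputs are assumed. The sketch raises an error when the x = 0 trick fails, which happens when the line never crosses the plane x = 0 or lies inside it.

```python
from fractions import Fraction

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def intersection_line(n1, d1, n2, d2):
    """Line where the planes r . n1 = d1 and r . n2 = d2 meet.

    Returns (point, direction), or None if the planes are parallel.
    """
    direction = cross(n1, n2)
    if direction == (0, 0, 0):
        return None  # parallel normals, so parallel planes
    # Plug in x = 0 and solve the remaining 2x2 system for y and z.
    det = n1[1] * n2[2] - n1[2] * n2[1]
    if det == 0:
        raise ValueError("the x = 0 trick fails here; try y = 0 or z = 0")
    y = Fraction(d1 * n2[2] - n1[2] * d2, det)
    z = Fraction(n1[1] * d2 - d1 * n2[1], det)
    return (0, y, z), direction

# P1 and P3 of Example 354
print(intersection_line((1, 1, 0), 1, (0, 1, 1), 1))
# P2 and P3 of Example 354; the direction is a multiple of (1, -1, 1)
print(intersection_line((-2, -2, 0), -4, (0, 1, 1), 1))
```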

Exercise 140. What is the relationship between the 3 planes P1 , P2 , and P3 , given by r ⋅ (1, 0, 1) = 1, r ⋅ (0, 1, −1) = −1, and r ⋅ (1, 1, 0) = 2? (Answer on p. 1029.)


Part IV

Complex Numbers

36 Complex Numbers: Introduction

Here’s a brief motivation of complex numbers:

1. Solve x − 1 = 0. Easy; the answer is a natural number: x = 1.

2. To solve x + 1 = 0, we must invent negative numbers. The answer is x = −1.

3. To solve $x^2 = 2$, we must invent irrational numbers. The answer is $x = \pm\sqrt{2}$.

4. To solve $x^2 = -1$, we must invent complex numbers.

We’ll start by defining the imaginary unit, then work our way to complex numbers.

Definition 92. The imaginary unit, denoted i, is a number that satisfies $i^2 = -1$.

Using the imaginary unit, we can construct other purely imaginary numbers:

Definition 93. A purely imaginary number is any real, non-zero multiple of the imaginary unit. That is, a purely imaginary number is any bi, where b ∈ R with b ≠ 0. (We specify that b ≠ 0 because 0i = 0 is not a purely imaginary number, but a real number.)

Example 355. $i + i = 2i = 2\sqrt{-1}$ is purely imaginary. So too are $-i = -\sqrt{-1}$ and $\pi i = \pi\sqrt{-1}$. i is both the imaginary unit and a purely imaginary number.

We can add real numbers to purely imaginary numbers to form imaginary numbers:

Definition 94. An imaginary number is any a + bi, where a, b ∈ R with b ≠ 0.

Again, we specify that b ≠ 0 because otherwise a + 0i would not be an imaginary number, but a real number.

Example 356. 3 + 2i is imaginary, but not purely imaginary. In contrast, 2i is both imaginary and purely imaginary.


A complex number is simply any real number or imaginary number.

Definition 95. A complex number is any a + bi, where a, b ∈ R.

Notice that here in contrast, we do not specify that b ≠ 0. The reason is that complex numbers include all real numbers.

Example 357. 10 and 17 are complex and real. 2 + 9i and 3 − 2i are complex and imaginary. 2i is complex, imaginary, and purely imaginary. i is complex, imaginary, purely imaginary, and also the imaginary unit.

We denoted the set of real numbers by the symbol R. We now denote the set of complex numbers by the symbol C.

Definition 96. The set of all complex numbers, denoted C, is defined as {a + bi ∣ a, b ∈ R}.

The set of reals is a proper subset of the set of complex numbers — formally, Fact 43. R ⊂ C.

Proof. Every element a ∈ R can be written as a + 0i and is thus an element of C. So R is a subset of C. Moreover R is not equal to C, because for example 3 + 7i ∈ C but 3 + 7i ∉ R.

Altogether then, R is a proper subset of C.

Complex numbers are thus the extension of the concept of real numbers. On the next page is a modified version of our taxonomy of numbers from p. 41, with the complex numbers fleshed out:

[Figure: taxonomy of numbers. Complex Numbers divide into Real Numbers and Imaginary Numbers. Imaginary Numbers divide into “Impure” Imaginary Numbers and Purely Imaginary Numbers, and the Purely Imaginary Numbers include The Imaginary Unit.]

Note: there is no such thing as a positive or negative complex number. To fully appreciate why this is so is beyond the scope of the A-levels. But for now here is a very simple example just to illustrate the point. Example 358. 1, −1, i, and −i are all complex numbers. We do say that 1 is a positive real number and −1 is a negative real number. But we do not say that i is a positive complex number or that −i is a negative complex number.

In fact, we do not even say that 1 is a positive complex number or that −1 is a negative complex number. Exercise 141. Fill in the following table. The first column has been done for you. (Answer on p. 1030.) √ √ Is this ... 13 − 2i 3i 0 4 4 + 2i i 3 A complex number? Yes A real number? No An imaginary number? Yes A purely imaginary number? No No The imaginary unit?

36.1 The Real and Imaginary Parts of Complex Numbers

Definition 97. Given a complex number z = a + bi, its real part is a and is denoted Re(z). Similarly, its imaginary part is b and is denoted Im(z). Example 359. Re(3 + 2i) = 3 and Im(3 + 2i) = 2. Example 360. Re(7) = 7 and Im(7) = 0.

Example 361. Re(19i) = 0 and Im(19i) = 19.

It is also often convenient to write complex numbers in ordered pair notation, with the first term being the real part and the second term being the imaginary. Example 362. Given z = 3 + 2i, we can also write z = (3, 2). Example 363. Given z = 7, we can also write z = (7, 0).

Example 364. Given z = 19i, we can also write z = (0, 19).

Of course, two complex numbers z and w are equal if and only if (i) their real parts are equal; AND (ii) their imaginary parts are equal.

Example 365. Suppose z = 3 + bi and w = a − 17i are equal. Then it must be that a = 3 and b = −17.

Exercise 142. Exactly two of the following complex numbers are identical. Find out which two. (Answer on p. 1030.)

$$a = \frac{1}{\sqrt{2}} - \frac{\sqrt{2}}{2}\,i, \qquad b = \frac{\sqrt{3}}{2} - \frac{1}{\sqrt{2}}\,i, \qquad c = \sin\frac{\pi}{3} - \sin\frac{\pi}{3}\,i, \qquad d = \frac{\sqrt{3}}{2} - \cos\left(-\frac{\pi}{4}\right) i.$$

37 Basic Arithmetic of Complex Numbers

The familiar arithmetic operations work the same way on imaginary numbers as they do on real numbers. Addition and subtraction are especially simple.

37.1 Addition and Subtraction

Example 366. Let z = −2 + i and w = 3i. Then z + w = −2 + 4i and z − w = −2 − 2i. We can also write z = (−2, 1) and w = (0, 3), so that

z + w = (−2 + 0, 1 + 3) = (−2, 4) and z − w = (−2 − 0, 1 − 3) = (−2, −2).

Example 367. Let z = 7 − i and w = 2 + 5i. Then z + w = 9 + 4i and z − w = 5 − 6i.

We can also write z = (7, −1) and w = (2, 5), so that

z + w = (7 + 2, −1 + 5) = (9, 4) and z − w = (7 − 2, −1 − 5) = (5, −6).

In general, Fact 44. If z = (a, b) and w = (c, d), then

z + w = (a + c, b + d) and z − w = (a − c, b − d).

Exercise 143. For each of the following, compute z + w and z − w. (Answer on p. 1031.)

(a) z = −5 + 2i, w = 7 + 3i.

(b) z = 3 − i, w = 11 + 2i.

(c) z = 1 + 2i, $w = 3 - \sqrt{2}\,i$.

37.2 Multiplication

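Below are listed the powers of i. Note that the cycle repeats after every fourth power, because $i^4 = 1$.

$$\begin{array}{lll}
i = i, & i^5 = i \times i^4 = i, & i^9 = i \times i^8 = i, \\
i^2 = i \times i = -1, & i^6 = i \times i^5 = -1, & i^{10} = i \times i^9 = -1, \\
i^3 = i \times i^2 = -i, & i^7 = i \times i^6 = -i, & i^{11} = i \times i^{10} = -i, \\
i^4 = i \times i^3 = 1, & i^8 = i \times i^7 = 1, & i^{12} = i \times i^{11} = 1, \quad \text{etc.}
\end{array}$$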
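The four-step cycle means that any power of i can be read off from the remainder of the exponent on division by 4. A minimal sketch, using Python's built-in complex type (where `1j` plays the role of i) and an illustrative function name `power_of_i`:

```python
def power_of_i(n):
    """Return i**n for an integer n, using the cycle of length 4."""
    cycle = [1, 1j, -1, -1j]  # i^0, i^1, i^2, i^3
    return cycle[n % 4]

# Reproduce the list of powers from i^1 to i^12.
for n in range(1, 13):
    print(n, power_of_i(n))
```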

Example 368. Let z = i and w = 1 + i. Then zw = i × (1 + i) = i × 1 + i × i = i − 1.

Example 369. Let z = −2 + i and w = 3i. Then

$$zw = (-2 + i) \times (3i) = (-2) \times (3i) + i \times (3i) = -6i + 3i^2 = -6i + 3(-1) = -3 - 6i.$$

Example 370. Let z = 2 − i and w = −1 + i. Then

$$zw = (2 - i)(-1 + i) = -2 + 2i + i - i^2 = -2 + 3i - i^2 = -2 + 3i - (-1) = -1 + 3i.$$

Example 371. Let z = 3 + 2i and w = −7 + 4i. Then

$$zw = (3 + 2i)(-7 + 4i) = -21 + 12i - 14i + (2i)(4i) = -21 - 2i + 8i^2 = -21 - 2i + 8(-1) = -29 - 2i.$$


In general, Fact 45. If z = (a, b) and w = (c, d), then

zw = (ac − bd, ad + bc) .
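Fact 45 translates directly into a function on ordered pairs. The sketch below (with the illustrative name `multiply`) re-computes Examples 369 to 371 and compares one result against Python's built-in complex multiplication.

```python
def multiply(z, w):
    """Product of two complex numbers given as ordered pairs (real, imaginary)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

print(multiply((-2, 1), (0, 3)))   # Example 369: (-3, -6)
print(multiply((2, -1), (-1, 1)))  # Example 370: (-1, 3)
print(multiply((3, 2), (-7, 4)))   # Example 371: (-29, -2)

# Cross-check against the built-in complex type.
print(complex(*multiply((3, 2), (-7, 4))) == (3 + 2j) * (-7 + 4j))
```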

Exercise 144. Prove Fact 45. (Answer on p. 1031.)

Exercise 145. For each of the following, compute zw. (Answer on p. 1031.)

(a) z = −5 + 2i, w = 7 + 3i.

(b) z = 3 − i, w = 11 + 2i.

(c) z = 1 + 2i, $w = 3 - \sqrt{2}\,i$.

Exercise 146. Given that z = 2 + i and $az^3 + bz^2 + 3z - 1 = 0$, find a and b. (Answer on p. 1031.)

37.3 Division

Recall that to rationalize a surd in the denominator (section 5.2), we used a trick involving conjugate pairs.

Example 372. $\dfrac{3}{1 + \sqrt{5}} = \dfrac{3}{1 + \sqrt{5}} \times \dfrac{1 - \sqrt{5}}{1 - \sqrt{5}} = \dfrac{3(1 - \sqrt{5})}{1 - 5} = \dfrac{3(\sqrt{5} - 1)}{4}$.

We called a + b and a − b a conjugate pair because $(a + b)(a - b) = a^2 - b^2$. If b is the square root of some number, then this is a rationalization (“make rational”) that helps get rid of an ugly surd.

Now, given z = a + bi, we call $z^* = a - bi$ its conjugate. And we call a + bi and a − bi a conjugate pair, because $(a + bi)(a - bi) = a^2 - (bi)^2 = a^2 - b^2 i^2 = a^2 + b^2$.

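This is a realization (“make real”) that helps get rid of an imaginary number in the denominator.

Example 373. $(1 + i)^* = 1 - i$, $i^* = -i$, and $(1 - i)^* = 1 + i$. Thus:

(a) $\dfrac{1}{1 + i} = \dfrac{1}{1 + i} \times \dfrac{1 - i}{1 - i} = \dfrac{1 - i}{1^2 - i^2} = \dfrac{1 - i}{1 + 1} = 0.5 - 0.5i$.

(b) $\dfrac{1}{i} = \dfrac{1}{i} \times \dfrac{-i}{-i} = \dfrac{-i}{-i^2} = \dfrac{-i}{1} = -i$.

(c) $\dfrac{1}{1 - i} = \dfrac{1}{1 - i} \times \dfrac{1 + i}{1 + i} = \dfrac{1 + i}{1^2 - i^2} = \dfrac{1 + i}{1 + 1} = 0.5 + 0.5i$.

In general,

Fact 46. If z = (a, b), then $z^* = (a, -b)$, $zz^* = a^2 + b^2$, and (for z ≠ 0)

$$\frac{1}{z} = \frac{1}{z} \times \frac{z^*}{z^*} = \frac{z^*}{a^2 + b^2} = \left(\frac{a}{a^2 + b^2}, \frac{-b}{a^2 + b^2}\right).$$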

Exercise 147. For each of the following z, write down its conjugate $z^*$ and hence compute its reciprocal (i.e. 1/z). (a) z = −5 + 2i. (b) z = 3 − i. (c) z = 1 + 2i. (Answer on p. 1031.)

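We now divide one complex number by another.

Example 374.

(a) $\dfrac{-2 + i}{3i} = \dfrac{-2 + i}{3i} \times \dfrac{-3i}{-3i} = \dfrac{6i - 3i^2}{-9i^2} = \dfrac{6i + 3}{9}$.

(b) $\dfrac{3 + i}{1 - i} = \dfrac{3 + i}{1 - i} \times \dfrac{1 + i}{1 + i} = \dfrac{(3 + i)(1 + i)}{1^2 - i^2} = \dfrac{3 + 3i + i + i^2}{1 + 1} = \dfrac{2 + 4i}{2} = 1 + 2i$.

(c) $\dfrac{1 + i}{3 - 2i} = \dfrac{1 + i}{3 - 2i} \times \dfrac{3 + 2i}{3 + 2i} = \dfrac{3 + 2i + 3i + 2i^2}{9 + 4} = \dfrac{1 + 5i}{13}$.

(d) $\dfrac{2 - i}{-1 + i} = \dfrac{2 - i}{-1 + i} \times \dfrac{-1 - i}{-1 - i} = \dfrac{-2 - 2i + i + i^2}{1 + 1} = \dfrac{-3 - i}{2} = -1.5 - 0.5i$.

(e) $\dfrac{3 + 2i}{-7 + 4i} = \dfrac{3 + 2i}{-7 + 4i} \times \dfrac{-7 - 4i}{-7 - 4i} = \dfrac{-21 - 12i - 14i - 8i^2}{49 + 16} = \dfrac{-13 - 26i}{65} = -0.2 - 0.4i$.

(f) $\dfrac{-3 + 6i}{2 + \pi i} = \dfrac{-3 + 6i}{2 + \pi i} \times \dfrac{2 - \pi i}{2 - \pi i} = \dfrac{-6 + 3\pi i + 12i - 6\pi i^2}{2^2 - \pi^2 i^2} = \dfrac{6\pi - 6 + (3\pi + 12)i}{4 + \pi^2}$.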

In general,

Fact 47. If z = (a, b) and w = (c, d) with w ≠ 0, then

$$\frac{z}{w} = \frac{z}{w} \times \frac{w^*}{w^*} = \frac{zw^*}{c^2 + d^2} = \left(\frac{ac + bd}{c^2 + d^2}, \frac{bc - ad}{c^2 + d^2}\right).$$
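Facts 46 and 47 can be sketched as functions on ordered pairs. The names `conjugate`, `divide`, and `reciprocal` are illustrative. Exact fractions are used so that the results can be compared with the worked examples, which assumes integer real and imaginary parts.

```python
from fractions import Fraction

def conjugate(z):
    """Conjugate of z = (a, b), namely (a, -b)."""
    a, b = z
    return (a, -b)

def divide(z, w):
    """z / w for ordered pairs with integer entries, as in Fact 47."""
    a, b = z
    c, d = w
    denom = c * c + d * d  # this is w times its conjugate
    if denom == 0:
        raise ZeroDivisionError("w must be non-zero")
    return (Fraction(a * c + b * d, denom), Fraction(b * c - a * d, denom))

def reciprocal(z):
    """1 / z, as in Fact 46."""
    return divide((1, 0), z)

print(divide((3, 1), (1, -1)))   # Example 374(b): equals 1 + 2i
print(divide((3, 2), (-7, 4)))   # Example 374(e): equals -1/5 - (2/5)i
print(reciprocal((0, 1)))        # Example 373(b): equals -i
```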

Exercise 148. Rewrite each of the following fractions into the form a + bi. (Answer on p. 1032.) √ 1 + 3i 2 − 3i 2 − πi 11 + 2i −3 7 − 2i √ . (d) (a) . (b) . (c) . (e) . (f) . −i 1+i i 2+i 5+i 3 − 2i

38 Solving Polynomial Equations

38.1 Complex Roots to Quadratic Equations

In section 14 (quadratic equations review), we saw that if $ax^2 + bx + c = 0$ has non-negative discriminant (i.e. $b^2 - 4ac \geq 0$), then its real roots are given by

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$

Example 375. Consider the equation $x^2 - 3x + 2 = 0$. Its discriminant is positive: $b^2 - 4ac = (-3)^2 - 4(1)(2) = 1 > 0$. Hence, it has two real roots, given by

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{3 \pm \sqrt{1}}{2} = 1, 2.$$

Now, armed with our new concept of imaginary numbers, we can completely dispense with the requirement that $b^2 - 4ac \geq 0$. We can simply say that $ax^2 + bx + c = 0$ ALWAYS has complex roots, given by

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$

Example 376. Consider the equation $x^2 - 2x + 2 = 0$. Its discriminant is negative: $b^2 - 4ac = (-2)^2 - 4(1)(2) = -4 < 0$. It has two imaginary (and thus also complex) roots, given by

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{2 \pm \sqrt{-4}}{2} = 1 \pm \frac{\sqrt{4} \times \sqrt{-1}}{2} = 1 \pm \frac{2i}{2} = 1 \pm i.$$

Notice that 1 + i was a root to the given quadratic equation. And interestingly enough, so too was 1 − i.
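The quadratic formula with a possibly negative discriminant can be tried out with Python's standard `cmath` module, whose `sqrt` accepts negative inputs. The function name `quadratic_roots` is illustrative, and results are floating-point approximations.

```python
import cmath

def quadratic_roots(a, b, c):
    """Both roots of a*x^2 + b*x + c = 0 (with a non-zero), possibly complex."""
    if a == 0:
        raise ValueError("not a quadratic: a must be non-zero")
    root_disc = cmath.sqrt(b * b - 4 * a * c)  # fine for negative discriminants
    return ((-b + root_disc) / (2 * a), (-b - root_disc) / (2 * a))

print(quadratic_roots(1, -3, 2))  # Example 375: the real roots 2 and 1
print(quadratic_roots(1, -2, 2))  # Example 376: the conjugate pair 1 + i and 1 - i
```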

It turns out that in general, a quadratic equation with real coefficients has roots that come in conjugate pairs. That is, if x + yi is a root, then so too is its conjugate x − yi. [42] More examples:

[42] This is not terribly surprising if you examine the general solution for the quadratic equation: the $\pm\sqrt{b^2 - 4ac}$ bit corresponds precisely to the imaginary part.

Example 377. Consider the equation $3x^2 + x + 1 = 0$. Its discriminant is negative: $b^2 - 4ac = (1)^2 - 4(3)(1) = -11 < 0$. It has two imaginary (and thus also complex) roots, given by

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{-1 \pm \sqrt{-11}}{6} = -\frac{1}{6} \pm \frac{\sqrt{11} \times \sqrt{-1}}{6} = -\frac{1}{6} \pm \frac{\sqrt{11}}{6}\,i.$$

Example 378. If 3 + 2i is a root to the quadratic equation $x^2 + bx + c = 0$ (where b and c are both real), then what are b and c? Well, we know that 3 − 2i is also a root to the equation. And so

$$x^2 + bx + c = [x - (3 + 2i)][x - (3 - 2i)] = (x - 3)^2 - (2i)^2 = x^2 - 6x + 13.$$

Hence, b = −6 and c = 13.

Exercise 149. Find the roots for each of the following quadratic equations. (Answer on p. 1033.)

(a) $x^2 + x + 1 = 0$.

(b) $x^2 + 2x + 2 = 0$.

(c) $3x^2 + 3x + 1 = 0$.

Exercise 150. If 1 − i is a root to the quadratic equation $x^2 + bx + c = 0$ (where b and c are both real), then what are b and c? (Answer on p. 1033.)

38.2 The Fundamental Theorem of Algebra

Recall from p. 187 that a polynomial of degree n in one variable is any expression $a_0 x^n + a_1 x^{n-1} + a_2 x^{n-2} + \dots + a_{n-1} x + a_n$ where each $a_i$ is a constant and x is the variable.

Theorem 4. The Fundamental Theorem of Algebra. A polynomial of degree n in one variable has exactly n complex zeros (though some may be repeated). That is, there are exactly n (possibly repeated) solutions to the equation $a_0 x^n + a_1 x^{n-1} + a_2 x^{n-2} + \dots + a_{n-1} x + a_n = 0$.

Proof. The proof of this theorem is way too advanced and so omitted from this book. [43]

Example 379. $x^2 - 1$ is a polynomial of degree 2. And indeed, $x^2 - 1 = 0$ has two solutions, namely 1 and −1.

Example 380. $x^2 + 1$ is a polynomial of degree 2. And indeed, $x^2 + 1 = 0$ has two solutions, namely i and −i.

There are sometimes “repeated solutions” or what are more formally called multiple roots, as the next example illustrates.

Example 381. $x^2 - 2x + 1 = 0$ has two (repeated) solutions, namely 1 and 1. We call 1 a multiple root (indeed a double root).

Example 382. $x^3 - 6x^2 + 12x - 8 = 0$ has three (repeated) solutions, namely 2, 2, and 2. We call 2 a multiple root (indeed a triple root).

[43] But see this MathOverflow Q&A if you’re interested.

The Fundamental Theorem of Algebra can be useful even if we have no idea how to find the solutions to an equation.

Example 383. $x^{17} + 3x^4 - 2x + 1$ is a polynomial of degree 17. I may not know what the solutions to $x^{17} + 3x^4 - 2x + 1 = 0$ are, but I know from the Fundamental Theorem of Algebra that there MUST be 17 solutions (though some may possibly be repeated).

Example 384. $x^4 + x^3 - 5x^2 + x - 6$ is a polynomial of degree 4 and so it must have four zeros. Suppose we are given as a hint that two of them are i and −i. Then how would we go about finding the other two zeros?

The problem of finding the zeros of a polynomial is really the same as the problem of factorizing a polynomial. This is because a is a zero of a polynomial if and only if (x − a) is a factor of the polynomial. So (x − i) and (x + i) are factors for the polynomial. Now, $(x - i)(x + i) = x^2 - i^2 = x^2 + 1$. So find $(x^4 + x^3 - 5x^2 + x - 6) \div (x^2 + 1)$ through long division:

$$\begin{array}{rrrrrr}
 & & & x^2 & +x & -6 \\ \hline
x^2 + 1 \;\big) & x^4 & +x^3 & -5x^2 & +x & -6 \\
 & x^4 & +0 & +x^2 & & \\ \hline
 & & x^3 & -6x^2 & +x & \\
 & & x^3 & +0 & +x & \\ \hline
 & & & -6x^2 & +0 & -6 \\
 & & & -6x^2 & +0 & -6 \\ \hline
 & & & & & 0
\end{array}$$

Hence, $x^4 + x^3 - 5x^2 + x - 6 = (x^2 + 1)(x^2 + x - 6)$. By observation, $x^2 + x - 6 = (x - 2)(x + 3)$. Hence,

$$x^4 + x^3 - 5x^2 + x - 6 = (x^2 + 1)(x^2 + x - 6) = (x - i)(x + i)(x - 2)(x + 3).$$

Altogether, the four zeros of the given polynomial are ±i, 2, and −3.
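Polynomial long division, as carried out above, is mechanical enough to sketch in code. In the sketch below a polynomial is a list of coefficients with the highest power first (an assumed representation), the function name `poly_divmod` is illustrative, and the divisor's leading coefficient is assumed to be non-zero.

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Long division of polynomials given as coefficient lists (highest power first).

    Returns (quotient, remainder) as lists of Fractions.
    """
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    if len(num) < len(den):
        return [Fraction(0)], num
    quotient = []
    # Each pass finds one quotient coefficient and subtracts its multiple of den.
    for i in range(len(num) - len(den) + 1):
        coeff = num[i] / den[0]
        quotient.append(coeff)
        for j, d in enumerate(den):
            num[i + j] -= coeff * d
    remainder = num[len(num) - len(den) + 1:]
    return quotient, remainder

# (x^4 + x^3 - 5x^2 + x - 6) divided by (x^2 + 1), as in Example 384
q, r = poly_divmod([1, 1, -5, 1, -6], [1, 0, 1])
print(q, r)  # quotient x^2 + x - 6, remainder 0
```

A zero remainder confirms that the divisor is a factor.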


Example 385. $x^3 - 3x^2 - 5x - 25$ is a polynomial of degree 3 and so it must have three zeros. As a hint, we are told that one of them is 5. What are the other two?

$$\begin{array}{rrrrr}
 & & x^2 & +2x & +5 \\ \hline
x - 5 \;\big) & x^3 & -3x^2 & -5x & -25 \\
 & x^3 & -5x^2 & & \\ \hline
 & & 2x^2 & -5x & \\
 & & 2x^2 & -10x & \\ \hline
 & & & 5x & -25 \\
 & & & 5x & -25 \\ \hline
 & & & & 0
\end{array}$$

So $x^3 - 3x^2 - 5x - 25 = (x - 5)(x^2 + 2x + 5)$. I’m unable to easily see how $x^2 + 2x + 5$ can be factorized. So let me just use the quadratic formula:

$$x = \frac{-2 \pm \sqrt{2^2 - 4(1)(5)}}{2} = -1 \pm \sqrt{1 - 5} = -1 \pm \sqrt{-4} = -1 \pm 2i.$$

Altogether then,

$$x^3 - 3x^2 - 5x - 25 = (x - 5)(x^2 + 2x + 5) = (x - 5)[x - (-1 + 2i)][x - (-1 - 2i)].$$

So the three zeros of the polynomial are 5 and −1 ± 2i.

Exercise 151. Each of the following polynomials has 1 as a zero. Find the other zeros. (Answer on p. 1034.)

(a) $x^3 + x^2 - 2$.

(b) $x^4 - x^2 - 2x + 2$.

38.3 The Complex Conjugate Roots Theorem

We saw above that if p + qi is a root to a quadratic equation $ax^2 + bx + c = 0$ (where a, b, and c are real), then so too is its conjugate p − qi.

What is perhaps surprising is that this generalizes to the case of any polynomial, provided that all coefficients of the polynomial are real.

Example 386. If told that 2 − i solves $x^3 - x^2 - 7x + 15 = 0$, we know immediately that its conjugate 2 + i also solves the same equation.

Example 387. If told that i solves $4x^4 + 5x^2 + 1 = 0$, we know immediately that its conjugate −i also solves the same equation. Similarly, if told also that 0.5i solves the same equation, we know immediately that its conjugate −0.5i also solves the same equation.

Theorem 5. (Complex Conjugate Roots Theorem.) Let $a_0, a_1, \dots, a_n$ be real. If a + bi solves $a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + \dots + a_1 x + a_0 = 0$, then so does a − bi.

Proof. Optional, see p. 870 in Appendices.

The condition that all coefficients $a_k$ are real is important. The above theorem does not apply if any of the coefficients are imaginary.

Example 388. i solves $x^2 + ix + 2 = 0$. However, its conjugate −i does not (as you should verify yourself).

Example 389. $\sqrt{2}/2 + i\sqrt{2}/2$ solves $x^2 = i$. However, its conjugate $\sqrt{2}/2 - i\sqrt{2}/2$ does not (as you should verify yourself).

Example 390. 2 + i solves $x^3 - (i + 2)x^2 + 2x - 2(2 + i) = 0$. However, its conjugate 2 − i does not (as you should verify yourself).
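The “verify yourself” checks can be done numerically by evaluating a polynomial at a root and at its conjugate. The sketch below uses Horner's rule with Python's built-in complex numbers. The function name `evaluate` and the coefficient-list representation (highest power first) are illustrative choices.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients highest power first) at x by Horner's rule."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result

# Real coefficients (Example 386): both 2 - i and 2 + i give 0.
real_poly = [1, -1, -7, 15]  # x^3 - x^2 - 7x + 15
print(evaluate(real_poly, 2 - 1j), evaluate(real_poly, 2 + 1j))

# An imaginary coefficient (Example 388): i gives 0, but -i does not.
mixed_poly = [1, 1j, 2]  # x^2 + ix + 2
print(evaluate(mixed_poly, 1j), evaluate(mixed_poly, -1j))
```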

Exercise 152. Each of the following polynomials has 2 − 3i as a zero. Find the other zeros. (Answer on p. 1035.)

(a) $x^4 - 6x^3 + 18x^2 - 14x - 39$.

(b) $-2x^4 + 21x^3 - 93x^2 + 229x - 195$.

39 The Argand Diagram

The complex plane (or Argand diagram) gives us a nice geometric interpretation: The complex numbers are simply points on the plane. The real axis is the horizontal or x-axis. The imaginary axis is the vertical or y-axis. Example 391. In the figure below, marked in red are the real numbers −3, 0, π, and 2, which may be written in ordered pair notation as (−3, 0), (0, 0), (π, 0), (2, 0). Points on the horizontal axis are real numbers.

In blue are the purely imaginary numbers −4i and 3i, which may be written in ordered pair notation as (0, −4) and (0, 3). Points on the vertical axis are purely imaginary numbers.

In green are the “impure” or “mixed” imaginary numbers 1 + i, −3 + 2i, 1 − 3i, and −4 − i, which may be written in ordered pair notation as (1, 1), (−3, 2), (1, −3), and (−4, −1). Points not on either axis are “impure” or “mixed” imaginary numbers.

[Figure: Argand diagram with both axes running from −5 to 5, showing the red, blue, and green points of Example 391.]

For our purposes, we’ll regard the complex plane C as being exactly identical to the cartesian plane {(x, y) ∶ x ∈ R, y ∈ R}. Both are represented graphically as a two-dimensional plane. The only difference is that we interpret points on each plane differently: Points on the complex plane are complex numbers, while points on the cartesian plane are ordered pairs of real numbers. [44]

Exercise 153. Illustrate the complex numbers 1, −3, 2i, 1 + 2i, and −1 − 3i on a single Argand diagram. (Answer on p. 1036.)

[44] The differences between C and $R^2$ in fact run deeper. See e.g. this discussion.

39.1 Complex Numbers in Polar Form

To write a complex number in standard form — i.e. z = x + iy, we need only two pieces of information: its real part (x) and its imaginary part (y).

We now write a complex number in polar form. Again, we need only two pieces of information: the modulus, denoted ∣z∣, and the argument, denoted arg z. Informally, the modulus is the length of the position vector of z; the argument is the angle the position vector of z makes with the positive x-axis.

Example 392. The complex number −3 = (−3, 0) has modulus ∣−3∣ = 3 and argument arg(−3) = π. The complex number −4i = (0, −4) has modulus ∣−4i∣ = 4 and argument arg(−4i) = −π/2. The complex number 3 + 3i = (3, 3) has modulus $|3 + 3i| = \sqrt{3^2 + 3^2} = 3\sqrt{2}$ and argument arg(3 + 3i) = π/4.

[Figure: Argand diagram with both axes running from −5 to 5, showing the points −3 = (−3, 0), −4i = (0, −4), and 3 + 3i = (3, 3).]

The formal definition of the modulus function is simple.

Definition 98. The modulus function has domain C, codomain R, and mapping rule $z = x + yi \mapsto \sqrt{x^2 + y^2}$. The modulus of z is denoted ∣z∣.

In contrast, it is tricky to write down a formal definition of the argument function. One problem is this: Angles are periodic.

Example 393. Consider again the complex number 3 + 3i = (3, 3). The angle it makes with the positive x-axis is π/4.

But angles are periodic: they come full circle every 2π radians. So it would make just as much sense to say that the angle is 9π/4. Or 17π/4. Or −7π/4. Or indeed any π/4 + 2kπ, where k is any integer.

To overcome this problem, we shall somewhat arbitrarily choose (−π, π] as our principal values. Thus, arg(3 + 3i) shall be uniquely defined to be the value π/4 and nothing else.

Another problem is this: We are tempted to simply define $\arg(x + yi) = \tan^{-1}(y/x)$. Unfortunately, the $\tan^{-1}$ function only takes values in (−π/2, π/2), whereas, as we just decided, arg should take values in (−π, π]. To overcome this, altogether, the argument function is defined as follows:

Definition 99. The argument function has domain C, codomain (−π, π], and mapping rule as given below, where z = x + yi:

$$\arg z = \begin{cases}
\tan^{-1}(y/x), & \text{if } x > 0 \text{ (top-right and bottom-right quadrants)}, \\
\text{Undefined}, & \text{if } x = 0 = y \text{ (the origin)}, \\
\pi/2, & \text{if } x = 0,\ y > 0 \text{ (the positive } y\text{-axis)}, \\
-\pi/2, & \text{if } x = 0,\ y < 0 \text{ (the negative } y\text{-axis)}, \\
\tan^{-1}(y/x) + \pi, & \text{if } x < 0,\ y \geq 0 \text{ (top-left quadrant, including the negative } x\text{-axis)}, \\
\tan^{-1}(y/x) - \pi, & \text{if } x < 0,\ y < 0 \text{ (bottom-left quadrant)}.
\end{cases}$$

The argument of z is denoted arg z.

(My mnemonic for the above: “$\arg z = \tan^{-1}(y/x)$. Top left +π. Bottom left −π.”)

We now illustrate and explain the above definition:

[Figure: Argand diagram illustrating arg z for the green, red, and blue points discussed below.]

• If x > 0 (top-right and bottom-right quadrants), then define $\arg(x + yi) = \tan^{-1}(y/x)$.

The green point in the figure above illustrates. The angle that the position vector of the green (x, y) makes with the positive x-axis is indeed simply $\tan^{-1}(y/x)$.

• If x = 0, y = 0 (the origin), then arg(x + yi) is undefined. In other words, we leave arg 0 undefined. [45]

• If x = 0, y > 0 (positive vertical axis), then define arg(x + yi) = arg(yi) = π/2.

• If x = 0, y < 0 (negative vertical axis), then define arg(x + yi) = arg(yi) = −π/2.

• If x < 0, y ≥ 0 (top-left quadrant plus the negative horizontal axis), then define $\arg(x + yi) = \tan^{-1}(y/x) + \pi$.

The red point illustrates. The angle its position vector makes with the negative x-axis is $\tan^{-1}(y/|x|)$. And so $\arg(x + yi) = \pi - \tan^{-1}(y/|x|)$. Observe that $\tan^{-1}(y/|x|) = \tan^{-1}(y/(-x)) = -\tan^{-1}(y/x)$. Thus, $\arg(x + yi) = \pi - \tan^{-1}(y/|x|) = \tan^{-1}(y/x) + \pi$.

• If x < 0, y < 0 (bottom-left quadrant), then define $\arg(x + yi) = \tan^{-1}(y/x) - \pi$.

The blue point illustrates. The angle its position vector makes with the negative x-axis is $\tan^{-1}(|y|/|x|)$. And so $\arg(x + yi) = \tan^{-1}(|y|/|x|) - \pi$. Observe that $|y|/|x| = (-y)/(-x) = y/x$. Thus, $\arg(x + yi) = \tan^{-1}(|y|/|x|) - \pi = \tan^{-1}(y/x) - \pi$.
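The six cases of the argument function can be written out literally and compared against the standard library's `math.atan2(y, x)`, which also returns the principal angle in (−π, π] for ordinary non-zero inputs. The function name `arg` and the choice to raise an error at the origin are illustrative.

```python
import math

def arg(x, y):
    """Principal argument of x + yi in (-pi, pi], case by case."""
    if x > 0:                       # right-hand quadrants
        return math.atan(y / x)
    if x == 0 and y == 0:           # the origin
        raise ValueError("arg 0 is undefined")
    if x == 0:                      # the vertical axis
        return math.pi / 2 if y > 0 else -math.pi / 2
    if y >= 0:                      # x < 0: top-left quadrant or negative x-axis
        return math.atan(y / x) + math.pi
    return math.atan(y / x) - math.pi  # x < 0 and y < 0: bottom-left quadrant

# Compare with math.atan2 on a point from each case.
for x, y in [(3, 3), (5, -2), (0, 2), (0, -4), (-3, 0), (-4, 7), (-1, -3)]:
    print((x, y), arg(x, y), math.atan2(y, x))
```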

[45] Some writers define arg 0 = 0, but we shall not do this.

The following fact is sometimes useful.

Fact 48. (a) z is purely imaginary (z is on the vertical axis, excluding the origin) ⇐⇒ arg z = ±π/2. (b) z is a non-zero real number (z is on the horizontal axis, excluding the origin) ⇐⇒ arg z = 0 or π.

Proof. Immediate from the definition of the arg function.

Exercise 154. (Answer on p. 1037.) Compute the modulus and argument of 4, −3, 2i, 1 + 2i, and −1 − 3i. Illustrate these numbers and their arguments on a single Argand diagram.

Exercise 155. (Answer on p. 1038.) Where on the complex plane must a complex number be, if its argument is ...

(a) Positive? (b) Negative? (c) 0? (d) π/2? (e) −π/2? (f) > π/2? (g) < −π/2?

Armed with the modulus and the argument, we have a nice geometric interpretation:

Fact 49. Let z be a complex number with ∣z∣ = r and arg z = θ. Then z = r(cos θ + i sin θ).

We call r(cos θ + i sin θ) the polar form representation of z.

Example 394. Given z = 5 − 2i = (5, −2), we have

$$|z| = \sqrt{5^2 + (-2)^2} = \sqrt{29} \quad\text{and}\quad \arg z = \tan^{-1}\frac{-2}{5} \approx -0.381.$$

So in polar form, $z \approx \sqrt{29}\,[\cos(-0.381) + i\sin(-0.381)]$.

Example 395. For z = 1 + 3i = (1, 3), we have

$$|z| = \sqrt{1^2 + 3^2} = \sqrt{10} \quad\text{and}\quad \arg z = \tan^{-1}\frac{3}{1} \approx 1.249.$$

So in polar form, $z \approx \sqrt{10}\,[\cos(1.249) + i\sin(1.249)]$.

Example 396. For z = −4 + 7i = (−4, 7), we have

$$|z| = \sqrt{(-4)^2 + 7^2} = \sqrt{65} \quad\text{and}\quad \arg z = \tan^{-1}\frac{7}{-4} + \pi \approx 2.090.$$

So in polar form, $z \approx \sqrt{65}\,[\cos(2.090) + i\sin(2.090)]$.
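Conversions to and from polar form can be checked with Python's standard `cmath` module, where `abs(z)` gives the modulus and `cmath.phase(z)` gives the principal argument. The helper names `to_polar` and `from_polar` are illustrative.

```python
import cmath
import math

def to_polar(z):
    """Return (modulus, argument) of a non-zero complex number z."""
    return abs(z), cmath.phase(z)

def from_polar(r, theta):
    """Return r(cos theta + i sin theta) as a complex number."""
    return complex(r * math.cos(theta), r * math.sin(theta))

# The three worked examples: 5 - 2i, 1 + 3i, and -4 + 7i.
for z in (5 - 2j, 1 + 3j, -4 + 7j):
    r, theta = to_polar(z)
    print(z, round(r ** 2), round(theta, 3), from_polar(r, theta))
```

The loop prints the squared modulus (29, 10, 65) rather than the modulus itself, to make the comparison with the square roots above easy.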

Exercise 156. Rewrite each of the following complex numbers in polar form: 1, −3, 2i, 1 + 2i, and −1 − 3i. (Answer on p. 1038.)

39.2 Complex Numbers in Exponential Form

Theorem 6. Euler Formula. For any θ ∈ R, $e^{i\theta} = \cos\theta + i\sin\theta$.

Proof. Optional, see p. 870 in the Appendices.

Fact 50. Let z be a complex number with ∣z∣ = r and arg z = θ. Then $z = re^{i\theta}$.

Proof. ∣z∣ = r and arg z = θ $\overset{1}{\implies}$ z = r(cos θ + i sin θ) $\overset{2}{\iff}$ $z = re^{i\theta}$, where step 1 uses Fact 49 and step 2 uses the Euler Formula.

We call $re^{i\theta}$ the exponential form representation of z.

Euler’s identity is one of the most extraordinary and beautiful equations in all of mathematics. It links together five fundamental mathematical constants: e, i, π, 1, and 0.

Corollary 4. (Euler’s identity.) $e^{i\pi} + 1 = 0$.

Proof. By Theorem 6, $e^{i\pi} = \cos\pi + i\sin\pi = -1 + 0 = -1$. Hence, $e^{i\pi} + 1 = 0$.
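Euler's formula and Euler's identity can be checked numerically with `cmath.exp` from Python's standard library. Floating-point arithmetic gives values that are only approximately zero, so the sketch reports the size of the gap rather than testing for exact equality. The function name `euler_gap` is illustrative.

```python
import cmath
import math

def euler_gap(theta):
    """Size of e^(i*theta) - (cos(theta) + i*sin(theta)); should be about 0."""
    return abs(cmath.exp(1j * theta) - complex(math.cos(theta), math.sin(theta)))

print(euler_gap(0.7))
print(abs(cmath.exp(1j * math.pi) + 1))  # Euler's identity, up to rounding error
```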

Example 397. The number z = 5 − 2i = (5, −2) has modulus $\sqrt{5^2 + (-2)^2} = \sqrt{29}$ and argument $\tan^{-1}(-2/5) \approx -0.381$. Hence, we can also write $z \approx \sqrt{29}\,e^{i(-0.381)}$.

Example 398. The number z = 1 + 3i = (1, 3) has modulus $\sqrt{1^2 + 3^2} = \sqrt{10}$ and argument $\tan^{-1}(3/1) \approx 1.249$. Hence, we can also write $z \approx \sqrt{10}\,e^{i(1.249)}$.

Example 399. The number z = −4 + 7i = (−4, 7) has modulus $\sqrt{(-4)^2 + 7^2} = \sqrt{65}$ and argument $\tan^{-1}(7/(-4)) + \pi \approx 2.090$. Hence, we can also write $z \approx \sqrt{65}\,e^{i(2.090)}$.

Exercise 157. Rewrite each of the following complex numbers in exponential form: 1, −3, 2i, 1 + 2i, and −1 − 3i. (Answer on p. 1038.)

40 More Arithmetic of Complex Numbers

Now that we know how to write complex numbers in polar and exponential forms, the arithmetic of complex numbers becomes even easier.

40.1 The Product of Two Complex Numbers

Fact 51. Product of two complex numbers. Let z and w be non-zero complex numbers. Then

$$|zw| = |z|\,|w| \quad\text{and}\quad \arg(zw) = \arg z + \arg w + 2k\pi,$$

(where k = −1, 0, or 1 is chosen to ensure that arg z + arg w + 2kπ ∈ (−π, π]).

Proof. Let z = r(cos θ + i sin θ) and w = s(cos φ + i sin φ). Then

zw = rs(cos θ + i sin θ)(cos φ + i sin φ)
   = rs(cos θ cos φ + i sin θ cos φ + i cos θ sin φ − sin θ sin φ)
   = rs [cos (θ + φ) + i sin (θ + φ)].

This is the complex number with modulus rs and which makes an angle θ + φ with the positive x-axis.

Note though that θ + φ may not be in (−π, π]. Thus, rather than say that arg(zw) = arg z + arg w, we instead say that arg (zw) = arg z + arg w + 2kπ (where k = −1, 0, 1 ensures that arg z + arg w + 2kπ ∈ (−π, π]).

Here is an alternative quicker proof of the above fact, using the exponential form.

Proof. Let z = re^(iθ) and w = se^(iφ). Then zw = rse^(i(θ+φ)). This is the complex number with modulus rs and which makes an angle θ + φ with the positive x-axis.
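The bookkeeping with 2kπ in Fact 51 can be sketched in a few lines of Python. The function names principal and product_polar are my own (hypothetical); the only library calls assumed are cmath.polar, cmath.phase and math.fmod.

```python
import cmath
import math


def principal(angle: float) -> float:
    """Shift angle by a multiple of 2*pi so that it lies in (-pi, pi]."""
    shifted = math.fmod(angle, 2 * math.pi)  # now strictly between -2*pi and 2*pi
    if shifted > math.pi:
        shifted -= 2 * math.pi
    elif shifted <= -math.pi:
        shifted += 2 * math.pi
    return shifted


def product_polar(z: complex, w: complex) -> tuple[float, float]:
    """Modulus and principal argument of zw, computed as in Fact 51."""
    r, theta = cmath.polar(z)
    s, phi = cmath.polar(w)
    return r * s, principal(theta + phi)


z, w = -3 + 4j, -5 + 2j
modulus, argument = product_polar(z, w)
print(modulus, argument)
```

For this pair the two arguments add up to more than π, so principal subtracts 2π, which is the case k = −1.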


Example 400. Let z = 5 − 2i and w = 1 + 3i. Then

∣z∣ = √(5² + (−2)²) = √29, and arg z = tan⁻¹(−2/5),
∣w∣ = √(1² + 3²) = √10, and arg w = tan⁻¹(3/1),

⟹ ∣zw∣ = √29 × √10 = √290, and arg (zw) = tan⁻¹(−2/5) + tan⁻¹(3/1) + 2kπ ≈ 0.869 + 2kπ = 0.869 (k = 0).

Notice that here arg z + arg w ≈ 0.869 ∈ (−π, π]. So arg z + arg w is already a principal value and we can simply set k = 0, so that arg(zw) = arg z + arg w.

So zw ≈ √290 (cos 0.869 + i sin 0.869) = √290 e^(i(0.869)).

To get zw in standard form, use a calculator. You’ll get

√290 cos [tan⁻¹(−2/5) + tan⁻¹(3/1)] = 11 and √290 sin [tan⁻¹(−2/5) + tan⁻¹(3/1)] = 13.

And indeed zw = (5 − 2i)(1 + 3i) = 11 + 13i.

Example 401. Let z = −4 + 7i and w = 1 − 6i. Then

∣z∣ = √((−4)² + 7²) = √65, and arg z = tan⁻¹[7/(−4)] + π,
∣w∣ = √(1² + (−6)²) = √37, and arg w = tan⁻¹(−6/1),

⟹ ∣zw∣ = √65 × √37 = √2405, and arg (zw) = tan⁻¹[7/(−4)] + π + tan⁻¹(−6/1) + 2kπ ≈ 0.684 + 2kπ = 0.684 (k = 0).

Notice that here arg z + arg w ≈ 0.684 ∈ (−π, π]. So arg z + arg w is already a principal value and we can simply set k = 0, so that arg(zw) = arg z + arg w.

So zw ≈ √2405 (cos 0.684 + i sin 0.684) = √2405 e^(i(0.684)).

To get zw in standard form, use a calculator. You’ll get

√2405 cos [tan⁻¹(−7/4) + π + tan⁻¹(−6/1)] = 38 and √2405 sin [tan⁻¹(−7/4) + π + tan⁻¹(−6/1)] = 31.

And indeed zw = (−4 + 7i)(1 − 6i) = 38 + 31i.

Example 402. Let z = −3 + 4i and w = −5 + 2i. Then

∣z∣ = √((−3)² + 4²) = 5, and arg z = tan⁻¹[4/(−3)] + π,
∣w∣ = √((−5)² + 2²) = √29, and arg w = tan⁻¹[2/(−5)] + π,

⟹ ∣zw∣ = 5√29, and arg (zw) = (tan⁻¹[4/(−3)] + π) + (tan⁻¹[2/(−5)] + π) + 2kπ ≈ 4.975 + 2kπ ≈ −1.308 (k = −1).

Notice that here arg z + arg w ≈ 4.975 ∉ (−π, π]. So we need to set k = −1 to get arg(zw) = arg z + arg w + 2kπ ≈ −1.308 ∈ (−π, π], so that arg(zw) is indeed a principal value.

So zw ≈ 5√29 [cos (−1.308) + i sin (−1.308)] = 5√29 e^(i(−1.308)).

To get zw in standard form, use a calculator. You’ll get

5√29 × cos [tan⁻¹(−4/3) + π + tan⁻¹(−2/5) + π] = 7 and 5√29 × sin [tan⁻¹(−4/3) + π + tan⁻¹(−2/5) + π] = −26.

And indeed zw = (−3 + 4i)(−5 + 2i) = 7 − 26i.

Exercise 158. Write down zw in polar and exponential forms, for each of the following pairs of z and w. (a) z = 1, w = −3. (b) z = 2i, w = 1 + 2i. (c) z = −1 − 3i, w = 3 + 4i. (Answer on p. 1039.)

40.2

The Ratio of Two Complex Numbers

Fact 52. Ratio of two complex numbers. Let z and w be complex numbers, with w ≠ 0. Then

∣z/w∣ = ∣z∣ / ∣w∣ and arg (z/w) = arg z − arg w + 2kπ,

(where k = −1, 0, 1 ensures that arg z − arg w + 2kπ ∈ (−π, π]).

Proof. Let z = r(cos θ + i sin θ) and w = s(cos φ + i sin φ). Then

z/w = r(cos θ + i sin θ) / [s(cos φ + i sin φ)]
    = (r/s) × [(cos θ + i sin θ)/(cos φ + i sin φ)] × [(cos φ − i sin φ)/(cos φ − i sin φ)]
    = (r/s) × (cos θ cos φ + sin θ sin φ + i sin θ cos φ − i cos θ sin φ) / (cos² φ + sin² φ)
    = (r/s) × (cos θ cos φ + sin θ sin φ + i sin θ cos φ − i cos θ sin φ) / 1
    = (r/s) [cos (θ − φ) + i sin (θ − φ)].

This is the complex number with modulus r/s and argument θ − φ + 2kπ (where k is the unique integer such that θ − φ + 2kπ ∈ (−π, π]).

Here is an alternative quicker proof of the above fact, using the exponential form.

Proof. Let z = re^(iθ) and w = se^(iφ). Then z/w = (r/s)e^(i(θ−φ)). This is the complex number with modulus r/s and argument θ − φ + 2kπ (where k is the unique integer such that θ − φ + 2kπ ∈ (−π, π]).

I’ll recycle the examples used in the previous section.

Example 403. Let z = 5 − 2i and w = 1 + 3i. Then

∣z∣ = √29, arg z = tan⁻¹(−2/5), ∣w∣ = √10, arg w = tan⁻¹(3/1)

⟹ ∣z/w∣ = √2.9, arg (z/w) = tan⁻¹(−2/5) − tan⁻¹(3/1) + 2kπ ≈ −1.630 (k = 0)

⟹ z/w ≈ √2.9 [cos (−1.630) + i sin (−1.630)] = √2.9 e^(i(−1.630)).

Example 404. Let z = −4 + 7i and w = 1 − 6i. Then

∣z∣ = √65, arg z = tan⁻¹[7/(−4)] + π, ∣w∣ = √37, arg w = tan⁻¹(−6/1)

⟹ ∣z/w∣ = √(65/37), arg (z/w) = tan⁻¹[7/(−4)] + π − tan⁻¹(−6/1) + 2kπ ≈ 3.496 + 2kπ ≈ −2.788 (k = −1)

⟹ z/w ≈ √(65/37) [cos (−2.788) + i sin (−2.788)] = √(65/37) e^(i(−2.788)).

Example 405. Let z = −3 + 4i and w = −5 + 2i. Then

∣z∣ = 5, arg z = tan⁻¹[4/(−3)] + π, ∣w∣ = √29, arg w = tan⁻¹[2/(−5)] + π

⟹ ∣z/w∣ = 5/√29, arg (z/w) = tan⁻¹[4/(−3)] − tan⁻¹[2/(−5)] + 2kπ ≈ −0.547 + 2kπ = −0.547 (k = 0)

⟹ z/w ≈ (5/√29) [cos (−0.547) + i sin (−0.547)] = (5/√29) e^(i(−0.547)).
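The three ratios above can be double-checked numerically. This is only a sketch: it divides the complex numbers directly and assumes that cmath.polar reports the principal argument, rather than following the arg z − arg w + 2kπ recipe of Fact 52.

```python
import cmath

# The (z, w) pairs from Examples 403 to 405.
pairs = [(5 - 2j, 1 + 3j), (-4 + 7j, 1 - 6j), (-3 + 4j, -5 + 2j)]

results = []
for z, w in pairs:
    modulus, argument = cmath.polar(z / w)  # modulus and principal argument of z/w
    results.append((modulus, argument))
    print(f"|z/w| = {modulus:.3f}, arg(z/w) = {argument:.3f}")
```

The printed moduli should agree with √2.9, √(65/37) and 5/√29, and the arguments with the values found by hand.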

Exercise 159. For each of the following pairs of z and w, write down z/w in polar and exponential forms. (Answer on p. 1040.) (a) z = 1, w = −3. (b) z = 2i, w = 1 + 2i. (c) z = −1 − 3i, w = 3 + 4i.

40.3

Sine and Cosine as Weighted Sums of the Exponential

Fact 53 expresses the sine and cosine functions as weighted sums of the exponential functions. It is not in the syllabus, but made a sudden first-time appearance on the 2015 A-level exams (Exercise 357), just to screw students over.

Fact 53. cos θ = (e^(iθ) + e^(−iθ))/2 and sin θ = (e^(iθ) − e^(−iθ))/(2i).

Proof. By the Euler Formula, e^(iθ) = cos θ + i sin θ. Moreover, e^(−iθ) = cos (−θ) + i sin (−θ) = cos θ − i sin θ, where the second equality uses the properties cos(−x) = cos x and sin(−x) = − sin x. Hence,

(e^(iθ) + e^(−iθ))/2 = (cos θ + i sin θ + cos θ − i sin θ)/2 = cos θ, as desired.

Similarly, (e^(iθ) − e^(−iθ))/(2i) = (cos θ + i sin θ − cos θ + i sin θ)/(2i) = sin θ, also as desired.

The 2015 question was about the sum (or difference) of two complex numbers that have the same modulus. Here’s a similar example:

Example 406. Let z = 5e^(iπ) and w = 5e^(0.4iπ). What, exactly, are the modulus and argument of z + w and of z − w?

Without the above fact, it’s not obvious. With the above fact, it’s easy. First, observe that 0.7 is the average of 1 and 0.4. Then factorise 5e^(iπ) + 5e^(0.4iπ) into a form where we can exploit the above fact. Like so:

z + w = 5e^(iπ) + 5e^(0.4iπ) = 5e^(0.7iπ) (e^(0.3iπ) + e^(−0.3iπ)) = 5e^(0.7iπ) × 2 cos(0.3π),

where the last = uses Fact 53. And thus (noting that cos(0.3π) is a positive real number, so that its argument is 0 and its modulus is itself):

arg (z + w) = arg [5e^(0.7iπ) × 2 cos(0.3π)]
            = arg 5 + arg (e^(0.7iπ)) + arg 2 + arg [cos(0.3π)] + 2kπ
            = 0 + 0.7π + 0 + 0 + 2kπ = 0.7π (k = 0),

∣z + w∣ = ∣5e^(0.7iπ) × 2 cos(0.3π)∣ = ∣5∣ ∣e^(0.7iπ)∣ ∣2∣ ∣cos(0.3π)∣ = 5 × 1 × 2 × cos(0.3π) = 10 cos(0.3π).

Altogether then, z + w = 10 cos(0.3π) e^(i(0.7π)).

We can play a similar trick to figure out the modulus and argument of z − w:

z − w = 5e^(iπ) − 5e^(0.4iπ) = 5e^(0.7iπ) (e^(0.3iπ) − e^(−0.3iπ)) = 5e^(0.7iπ) × 2i sin(0.3π),

where again the last = uses Fact 53. And thus (noting that sin(0.3π) is also a positive real number):

arg (z − w) = arg [5e^(0.7iπ) × 2i sin(0.3π)]
            = arg 5 + arg (e^(0.7iπ)) + arg 2 + arg i + arg [sin(0.3π)] + 2kπ
            = 0 + 0.7π + 0 + 0.5π + 0 + 2kπ = −0.8π (k = −1),

∣z − w∣ = ∣5e^(0.7iπ) × 2i sin(0.3π)∣ = ∣5∣ ∣e^(0.7iπ)∣ ∣2∣ ∣i∣ ∣sin(0.3π)∣ = 5 × 1 × 2 × 1 × sin(0.3π) = 10 sin(0.3π).

Altogether then, z − w = 10 sin(0.3π) e^(i(−0.8π)).
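As a sanity check on Example 406, here is a short Python sketch that adds and subtracts the two numbers directly and reads off the modulus and argument. It assumes only cmath.exp and cmath.polar from the standard library; the variable names are mine.

```python
import cmath
import math

z = 5 * cmath.exp(1j * math.pi)
w = 5 * cmath.exp(0.4j * math.pi)

# Modulus and principal argument of the sum and of the difference.
sum_modulus, sum_argument = cmath.polar(z + w)
diff_modulus, diff_argument = cmath.polar(z - w)

print(sum_modulus, sum_argument / math.pi)    # compare with 10 cos(0.3π) and 0.7
print(diff_modulus, diff_argument / math.pi)  # compare with 10 sin(0.3π) and -0.8
```

Dividing the arguments by π makes them easy to compare with 0.7π and −0.8π.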


In general,

Fact 54. e^(iθ) + e^(iφ) = 2 cos((θ − φ)/2) e^(i((θ+φ)/2 + 2kπ)) and e^(iθ) − e^(iφ) = 2 sin((θ − φ)/2) e^(i((θ+φ+π)/2 + 2mπ)),

where k, m = −1, 0, 1 are to ensure that 0.5(θ + φ) + 2kπ ∈ (−π, π] and 0.5(θ + φ + π) + 2mπ ∈ (−π, π].

Proof. See Exercise 161.

Exercise 160. Let z = 3e^(i(0.2π)) and w = 3e^(i(−0.9π)). By mimicking the steps in Example 406, find z + w and z − w in exact polar and exponential forms. (Answer on p. 1041.)

Exercise 161. Prove Fact 54. (Answer on p. 1042.)

SYLLABUS ALERT

If you’re taking the 9758 (revised) exam, you are done with Part IV: Complex Numbers. The remaining chapters in Part IV cover the following, which are on the 9740 (old) syllabus but not on the 9758 (revised) syllabus:
• geometrical effects of conjugating a complex number and of adding, subtracting, multiplying, dividing two complex numbers
• loci such as ∣z − c∣ ≤ r, ∣z − a∣ ≤ ∣z − b∣ and arg (z − a) = α
• use of de Moivre’s theorem to find the powers and nth roots of a complex number.

41

Geometry of Complex Numbers

In secondary school, we learnt to do some geometry using cartesian equations. And in Part III (Vectors), we learnt to do some geometry using vector equations. Now, we’ll learn to do some geometry using complex equations!

41.1

The Sum and Difference of Two Complex Numbers

Given two complex numbers z = x + iy and w = a + ib, their sum is simply the complex number z + w = (x + a) + (y + b)i. We already know how to interpret z = (x, y) and w = (a, b) as points on the plane. This gives us a nice geometric interpretation: z +w = (x+a, y +b) is likewise a point on the plane.

We can also interpret z = (x, y) and w = (a, b) as position vectors. And thus as usual, the sum of two vectors is itself a vector: z + w = (x + a, y + b).

[Figure: z = (x, y), w = (a, b), and their sum z + w = (x + a, y + b), drawn as points and position vectors in the plane.]

Similarly, their difference z − w is simply the point (x − a, y − b). This corresponds to the vector z − w = (x − a)i + (y − b)j (where i and j here are the unit vectors from Part III, not the imaginary unit).

[Figure: z = (x, y), w = (a, b), and their difference z − w = (x − a, y − b), drawn as points and position vectors in the plane.]

Note that in general, ∣z + w∣ ≠ ∣z∣ + ∣w∣ and ∣z − w∣ ≠ ∣z∣ − ∣w∣. This is perhaps obvious from the above figures and also bearing in mind Corollary 3 (the sum of the lengths of any two sides of a triangle is always greater than the length of the third side).

41.2

The Product and Ratio of Two Complex Numbers

With sums and differences, there was an exact analogy to vectors. In contrast, with products and ratios of complex numbers, there is no analogy to vectors. In particular, the product of two complex numbers has nothing to do with the scalar product or vector product of their position vectors.

Nonetheless, we do have nice geometric interpretations. We already know from Fact 51 that the product of two complex numbers z and w is simply the complex number zw with

1. ∣zw∣ = ∣z∣ ∣w∣; and
2. arg (zw) = arg z + arg w + 2kπ, where k = −1, 0, 1 ensures that arg z + arg w + 2kπ ∈ (−π, π].

So geometrically, to get zw, we take z and

1. First multiply its length by a factor equal to the length of w; 2. Then rotate it anti-clockwise by the angle arg w.
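The two geometric steps above can be sketched in Python. The function name scale_then_rotate is my own (hypothetical); rotating by an angle is done here by multiplying by e^(i × angle), which is an application of the Euler Formula rather than anything specific to this section.

```python
import cmath


def scale_then_rotate(z: complex, w: complex) -> complex:
    """Build zw geometrically: stretch z by |w|, then rotate it by arg w."""
    scaled = z * abs(w)                        # step 1: multiply the length by |w|
    rotation = cmath.exp(1j * cmath.phase(w))  # unit complex number at angle arg w
    return scaled * rotation                   # step 2: rotate anti-clockwise by arg w


z, w = 2 + 1j, 1 + 3j
geometric = scale_then_rotate(z, w)
print(geometric, z * w)
```

Both printed values should agree (up to rounding) with the ordinary product zw.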

[Figure: z = (a, b), w = (c, d), and their product zw = (ac − bd, ad + bc) on the Argand diagram.]

Similarly, we already know from Fact 52 that the ratio of complex numbers z to w is simply the complex number z/w with

1. ∣z/w∣ = ∣z∣ / ∣w∣; and
2. arg (z/w) = arg z − arg w + 2kπ, where k = −1, 0, 1 ensures that arg z − arg w + 2kπ ∈ (−π, π].

So geometrically, to get z/w, we take z and

1. First divide its length by the length of w;
2. Then rotate it anti-clockwise by the angle − arg w. (Or equivalently, clockwise by the angle arg w.)

[Figure: z = (a, b) and w = (c, d) on the Argand diagram.]

41.3

Conjugating a Complex Number

If z = x + yi = (x, y), then z∗ = x − yi = (x, −y). So the geometric effect of conjugating a complex number is simply to reflect it in the horizontal axis.

Fact 55. Complex conjugate. Let z be a complex number. Then

∣z∗∣ = ∣z∣ and arg (z∗) = − arg z + 2kπ,

where k = −1, 0, 1 ensures that − arg z + 2kπ ∈ (−π, π].

Proof. Let z = r(cos θ + i sin θ). Then

z∗ = r(cos θ − i sin θ) = r [cos (−θ) + i sin (−θ)].

This is the complex number with modulus r and which makes an angle −θ with the positive x-axis.

So given z = r(cos θ + i sin θ) = re^(iθ), its conjugate is simply z∗ = r [cos (−θ) + i sin (−θ)] = re^(i(−θ)).

[Figure: z = (x, y) and its conjugate z∗ = (x, −y), which are mirror images of each other in the x-axis.]

42

Loci Involving Cartesian Equations

A locus (plural: loci) is a set of points that satisfy some condition (or conditions). We’ve actually already encountered plenty of loci in Part I (Functions and Graphs), so this is nothing new. This chapter reviews loci involving cartesian equations (and inequalities). The goal is to prepare you for the next chapter, where we look at loci involving complex equations (and inequalities).

42.1

Circles

Example 407. {(x, y) ∶ x² + y² = 1} is the set of all points (x, y) in the cartesian plane that satisfy the condition x² + y² = 1. Graphically, this locus describes the unit circle centred on the origin. (To be clear, it includes only the circumference of the circle.)

[Figure: the unit circle {(x, y) ∶ x² + y² = 1}.]

Example 408. The locus {(x, y) ∶ x² + y² ≤ 1} describes the entire interior of the unit circle centred on the origin, including the circumference of the circle.

[Figure: the locus {(x, y) ∶ x² + y² ≤ 1}.]

Example 409. The locus {(x, y) ∶ x² + y² < 1} describes the entire interior of the unit circle centred on the origin, excluding the circumference of the circle.

[Figure: the locus {(x, y) ∶ x² + y² < 1}.]

Example 410. The locus {(x, y) ∶ x² + y² ≥ 1} describes everything outside the unit circle centred on the origin, including the circumference of the circle.

[Figure: the locus {(x, y) ∶ x² + y² ≥ 1}.]

Example 411. The locus {(x, y) ∶ x² + y² > 1} describes everything outside the unit circle centred on the origin, excluding the circumference of the circle.

[Figure: the locus {(x, y) ∶ x² + y² > 1}.]

Exercise 162. Sketch the following loci: (Answer on p. 1044.)
(a) {(x, y) ∶ (x − a)² + (y − b)² = r²}.
(b) {(x, y) ∶ (x − a)² + (y − b)² ≤ r²}.
(c) {(x, y) ∶ (x − a)² + (y − b)² < r²}.
(d) {(x, y) ∶ (x − a)² + (y − b)² ≥ r²}.
(e) {(x, y) ∶ (x − a)² + (y − b)² > r²}.

42.2

Lines

Example 412. The locus {(x, y) ∶ y = x} describes the line y = x.

[Figure: the line {(x, y) ∶ y = x}.]

Example 413. The locus {(x, y) ∶ y ≤ x} describes the set of all points under the line y = x, including the line itself. It contains literally half the plane, so we call this a half-plane. We can also specify that this is a closed half-plane, where the word closed means that it also includes the line y = x.

Graphically, the locus {(x, y) ∶ y < x} describes the set of all points under the line y = x, excluding the line itself. This is an open half-plane. The word open means that it excludes the line y = x.

[Figure: two panels showing the closed half-plane {(x, y) ∶ y ≤ x} and the open half-plane {(x, y) ∶ y < x}.]

Example 414. Graphically, the locus {(x, y) ∶ y ≥ x} describes the set of all points above the line y = x, including the line itself. Again, this is a closed half-plane.

Graphically, the locus {(x, y) ∶ y > x} describes the set of all points above the line y = x, but excluding the line itself. Again, this is an open half-plane.

[Figure: two panels showing the closed half-plane {(x, y) ∶ y ≥ x} and the open half-plane {(x, y) ∶ y > x}.]

The locus of points that are equidistant to two points is simply a line.

Example 415. Let (a, b) and (c, d) be points. The locus of points that are equidistant to (a, b) and (c, d) is the line illustrated below. This is because if you pick any point (e.g. P) on the line, it is indeed equidistant to (a, b) and (c, d). And if you pick any point (e.g. Q) not on the line, it must be either closer to (a, b) or closer to (c, d). In this case, Q is closer to (a, b) than to (c, d).

[Figure: the points (a, b) and (c, d) and the line of points equidistant to them. Any point on the line, such as P, is equidistant to the two points. Any point not on the line, such as Q, must be closer to one of the two points.]

Let (a, b) and (c, d) be points. We now prove that the locus {(x, y) ∶ ∣(x − a, y − b)∣ = ∣(x − c, y − d)∣} simply describes a line:

∣(x − a, y − b)∣ = ∣(x − c, y − d)∣
⟺ (x − a)² + (y − b)² = (x − c)² + (y − d)²
⟺ x² − 2ax + a² + y² − 2by + b² = x² − 2cx + c² + y² − 2dy + d²
⟺ −2ax + a² − 2by + b² = −2cx + c² − 2dy + d²
⟺ 2(d − b)y + 2(c − a)x + a² + b² − (c² + d²) = 0.

This last equation is of the form αx + βy + γ = 0, which is simply a line (see footnote 46).

Exercise 163. (a) Find the cartesian equation of the line that is equidistant to the points (1, 4) and (−5, 0).

(b) Describe in words the set {(x, y) ∶ ∣(x − 17, y − 3)∣ = ∣(x + 2, y + 11)∣}. Then rewrite the cartesian equation ∣(x − 17, y − 3)∣ = ∣(x + 2, y + 11)∣ into the form ay + bx + c = 0. (Answer on p. 1046.)
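The final equation of the derivation above, 2(c − a)x + 2(d − b)y + a² + b² − (c² + d²) = 0, translates directly into a few lines of Python. This is only a sketch: the function name equidistant_line and the sample points (0, 0) and (2, 4) are my own choices for illustration.

```python
import math


def equidistant_line(a: float, b: float, c: float, d: float) -> tuple[float, float, float]:
    """Coefficients (alpha, beta, gamma) of the line alpha*x + beta*y + gamma = 0
    consisting of the points equidistant to (a, b) and (c, d)."""
    return 2 * (c - a), 2 * (d - b), a**2 + b**2 - (c**2 + d**2)


alpha, beta, gamma = equidistant_line(0, 0, 2, 4)
print(alpha, beta, gamma)  # the line 4x + 8y - 20 = 0, i.e. x + 2y = 5

# The point (5, 0) satisfies x + 2y = 5; check that it is equidistant to both points.
distance_to_first = math.dist((5, 0), (0, 0))
distance_to_second = math.dist((5, 0), (2, 4))
print(distance_to_first, distance_to_second)
```

As expected, the midpoint (1, 2) of the two sample points also lies on the line.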

46 If we’d like, we can further simplify this equation. If d − b ≠ 0, then it can be rewritten as y = [(a − c)/(d − b)] x + [c² + d² − (a² + b²)] / [2(d − b)]. And if d − b = 0, then this is a vertical line whose equation may be rewritten as x = [c² + d² − (a² + b²)] / [2(c − a)].

42.3

Intersection of Lines and Circles

Example 416. {(x, y) ∶ x² + y² = 1, y = x} is the set of points satisfying two conditions, namely the equation x² + y² = 1 and the equation y = x. This locus describes the intersection points of the circle x² + y² = 1 and the line y = x. By plugging the equation of the line into the equation of the circle, we can show that this locus consists of only two points:

{(x, y) ∶ x² + y² = 1, y = x} = {(−√2/2, −√2/2), (√2/2, √2/2)}.

[Figure: the circle {(x, y) ∶ x² + y² = 1}, the line {(x, y) ∶ y = x}, and their two intersection points {(x, y) ∶ y = x, x² + y² = 1}.]
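The "plugging in" step of Example 416 can be spelled out in a short Python sketch (the variable names are mine). Substituting y = x into x² + y² = 1 gives 2x² = 1, so x = ±√(1/2).

```python
import math

# Substituting y = x into x^2 + y^2 = 1 gives 2x^2 = 1, so x = ±sqrt(1/2).
x = math.sqrt(1 / 2)
points = [(-x, -x), (x, x)]

for px, py in points:
    # Each point should lie on the line (px == py) and on the circle.
    print((px, py), px**2 + py**2)
```

Each printed point should satisfy both conditions, with coordinates ±√2/2.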


Example 417. {(x, y) ∶ x² + y² ≤ 1, y = x} is the portion of the line y = x that is within the interior of the circle, including the endpoints. It is illustrated in green in the figure below.

[Figure: the line {(x, y) ∶ y = x}, the disc {(x, y) ∶ x² + y² ≤ 1}, and the line segment {(x, y) ∶ y = x, x² + y² ≤ 1}.]

Example 418. The locus {(x, y) ∶ x² + y² > 1, y > x} describes the region that lies both outside the circle x² + y² = 1 and above the line y = x, excluding the circumference of the circle and the line. It is illustrated in green in the figure below.

[Figure: the line {(x, y) ∶ y = x}, the circle {(x, y) ∶ x² + y² = 1}, and the region {(x, y) ∶ y > x, x² + y² > 1}.]

Exercise 164. Sketch on a cartesian plane the locus {(x, y) ∶ x² + y² = 1, x > 0}. (Answer on p. 1047.)

We now turn to loci involving complex equations (and inequalities).

43

Loci Involving Complex Equations

43.1

Circles

On an Argand diagram (or complex plane), the locus {z ∈ C ∶ ∣z∣ = 1} simply describes the unit circle centred on the origin, as we now prove:

Let z = (x, y). Then ∣z∣ = √(x² + y²) and so the equation ∣z∣ = 1 is equivalent to √(x² + y²) = 1 or x² + y² = 1. Hence, {z ∈ C ∶ ∣z∣ = 1} = {(x, y) ∶ x² + y² = 1}.

But we already saw in the previous chapter that the locus {(x, y) ∶ x² + y² = 1} describes the unit circle centred on the origin.

Loci involving complex equations (or inequalities) can usually be easily transformed into a familiar cartesian equation (or inequality).

[Figure: the unit circle {z ∶ ∣z∣ = 1} = {(x, y) ∶ x² + y² = 1}.]

Exercise 165. (a) Prove that the locus {z ∈ C ∶ ∣z∣ = r} describes the circle of radius r centred on the origin.

(b) Let c be some fixed complex number. Prove that the locus {z ∈ C ∶ ∣z − c∣ = r} is the circle of radius r centred on the point c. (c) What does the locus {z ∈ C ∶ ∣z − c∣ ≤ r} describe?

(d) What does the locus {z ∈ C ∶ ∣z − c∣ < r} describe? (Answer on p. 1048.)


43.2

Lines

Let b and c be fixed complex numbers. The equation ∣z − c∣ = ∣z − b∣ is simply the condition that z is equidistant to b and c.

Hence, the locus {z ∈ C ∶ ∣z − c∣ = ∣z − b∣} simply describes the points that are equidistant to b and c. And as we showed earlier, such a locus is simply a line.

[Figure: the points b and c and the line {z ∶ ∣z − b∣ = ∣z − c∣}. Any point on the line, such as d, is equidistant to the two points. Any point not on the line, such as e, must be closer to one of the two points.]

Exercise 166. Let b and c be fixed complex numbers. What is the locus of complex numbers z that satisfy each of the following inequalities? (a) ∣z − c∣ ≤ ∣z − b∣. (b) ∣z − c∣ < ∣z − b∣. (c) ∣z − c∣ ≥ ∣z − b∣. (d) ∣z − c∣ > ∣z − b∣. (Answer on p. 1048.)

43.3

Rays

The locus {z ∈ C ∶ arg z = α} describes the set of points z whose argument is α. It is thus the ray (or half-line) which starts from but excludes the origin and which makes an angle α with the positive x-axis.

The figure below illustrates. The point A is in the locus, because indeed arg A = α. In contrast, the point B is not in the locus, because arg B ≠ α.

Note importantly that points along the dotted red ray, such as C, are not in the locus, because arg C = α − π ≠ α. Moreover, the origin is not in the locus, because arg 0 is undefined.

[Figure: the ray {z ∶ arg z = α} through A, making an angle α with the positive x-axis; the point B off the ray; and the point C on the dotted red ray that extends the locus backwards through the origin.]

If we really wanted to, we could rewrite the complex equation arg z = α into cartesian form. But it turns out that in this case, the cartesian form is more complicated. And so we’ll just stick with the equation arg z = α.

Let a be a complex number. Then the graph of arg (z − a) = α is simply the translation of the graph of arg z = α. And so arg (z − a) = α is the ray (or half-line) which starts from but excludes the point a and which makes an angle α with the positive x-axis.

The point b is in the locus, because indeed arg(b − a) = α. In contrast, the point c is not in the locus, because arg(c − a) ≠ α.

Note importantly that points along the dotted red ray, such as d, are not in the locus, because arg(d − a) = α − π ≠ α. Moreover, the point a is not in the locus, because arg(a − a) = arg 0 is undefined.
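The membership test for the locus arg (z − a) = α can be sketched in Python. The function name on_ray, the tolerance, and the sample numbers are my own (hypothetical); the sketch assumes that cmath.phase returns the principal argument.

```python
import cmath
import math


def on_ray(z: complex, a: complex, alpha: float, tol: float = 1e-9) -> bool:
    """Return True if z lies in the locus arg(z - a) = alpha."""
    if z == a:
        return False  # arg 0 is undefined, so the point a itself is excluded
    return math.isclose(cmath.phase(z - a), alpha, abs_tol=tol)


a = 1 + 1j
alpha = math.pi / 4

print(on_ray(3 + 3j, a, alpha))    # on the ray
print(on_ray(-1 - 1j, a, alpha))   # on the backwards extension: its argument is alpha - pi
print(on_ray(a, a, alpha))         # the starting point itself
```

Only the first of these three points is in the locus.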

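[Figure: the ray {z ∶ arg (z − a) = α} starting from a and passing through b, making an angle α with the positive x-axis; the point c off the ray; and the point d on the dotted red ray that extends the locus backwards through a.]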

The 9740 syllabus doesn’t mention loci of the form α ≤ arg z ≤ β. Unfortunately, such loci have occasionally appeared on the A-level exams (see footnote 47), which means you have to learn them.

The locus α ≤ arg z ≤ β is simply the region bounded by (and including) the rays arg z = α and arg z = β.

[Figure: the rays {z ∶ arg z = α} and {z ∶ arg z = β}, with the angles α and β marked from the positive x-axis.]

Exercise 167. What is {z ∈ C ∶ ∣z∣ = 1, −π < arg z < 0}? (Answer on p. 1048.)

47 See Exercises 361 (2013), 365 (2011), and 371 (2008).

43.4

Quick O-Level Revision: Properties of The Circle

Definition 100. A chord is a line segment connecting any two points on a circle’s circumference.

Here are a few properties of the circle (which you are supposed to still remember from O-levels) and which would definitely have been useful in some complex loci questions in the past ten years’ A-levels.

Fact 56. Let A be a point exterior to a circle. Let B and C be the points at which the tangents from A touch the circle. Let O be the centre of the circle.
(a) The line through A and O
(i) bisects the angle ∠BAC;
(ii) is the perpendicular bisector of the chord BC; and
(iii) passes through the points D and E, which are the points on the circle that are respectively closest to and furthest from A.
(b) The lengths AB and AC are equal.
(c) The angles ∠OBA and ∠OCA are right.

[Figure: a circle with centre O, the exterior point A, the tangents from A touching the circle at B and C, the chord BC with its perpendicular bisector through A and O, and the points D and E where that line meets the circle.]

Proof. See p. ??? of my O-Level Maths Textbook (coming soon)!

Here’s an example that illustrates the uses of the above properties of the circle.

Example 419. The complex number z satisfies the equation ∣z + 4 + 2i∣ = 1. (a) What are the maximum and minimum possible values of ∣z∣? (b) For what values of z is ∣z∣ maximised and minimised?

∣z + 4 + 2i∣ = 1 describes a unit circle centred on the point C = (−4, −2). Even if not asked for, you should make a quick sketch to help yourself see better.

By the above fact, ∣z∣ is maximised at F and minimised at N, where F and N lie on the line through the origin and the circle’s centre.

(a) The maximum value of ∣z∣ is the length OF = OC + CF = √((−4)² + (−2)²) + 1 = √20 + 1.

The minimum value of ∣z∣ is the length ON = OC − CN = √((−4)² + (−2)²) − 1 = √20 − 1.

(b) Consider △CAN. The line through F, C, N, and the origin is y = 0.5x. So AN = 0.5CA. Moreover, CA² + AN² = CN² = 1² = 1. Altogether then, CA² + 0.25CA² = 1 or CA² = 4/5 or CA = 2/√5. And AN = 1/√5. Hence,

N = (−4 + 2/√5, −2 + 1/√5).

Symmetrically, F = (−4 − 2/√5, −2 − 1/√5).

[Figure: the circle ∣z + 4 + 2i∣ = 1 with centre C; the line y = 0.5x through the origin O and the centre of the circle, meeting the circle at N (nearest to O) and F (furthest from O); the right-angled triangle CAN with CA horizontal and AN vertical; and the points U and D where the tangents from O touch the circle.]

(Example 419 continued.) z satisfies ∣z + 4 + 2i∣ = 1. (c) What are the maximum and minimum possible values of arg z? (d) For what values of z is arg z maximised and minimised?

(c) The points U and D at which arg z is respectively maximised and minimised are also where the tangents OU and OD from the origin touch the circle. By the above fact, OU is perpendicular to CU. Similarly, OD is perpendicular to CD.

The angle the lower half of the line y = 0.5x makes with the positive x-axis is θ = tan⁻¹ 0.5 − π. The angle ∠COU is sin⁻¹(1/√20). Hence, arg U = θ + ∠COU = tan⁻¹ 0.5 − π + sin⁻¹(1/√20) ≈ −2.452.

Symmetrically, arg D = θ − ∠COD = θ − ∠COU = tan⁻¹ 0.5 − π − sin⁻¹(1/√20) ≈ −2.903.

(d) △ODC is right. So OD² + CD² = OC². OC = √20 and CD = 1. Hence, OD = √(20 − 1) = √19. Altogether then, ∣D∣ = √19 and arg D = tan⁻¹ 0.5 − π − sin⁻¹(1/√20).

Symmetrically, we also have ∣U∣ = √19 and arg U = tan⁻¹ 0.5 − π + sin⁻¹(1/√20).

[Figure, reproduced for convenience: the circle ∣z + 4 + 2i∣ = 1 with centre C, the line y = 0.5x through the origin O and the centre of the circle, and the points N, F, A, U and D.]
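Answers like those in Example 419 are easy to get wrong by a sign, so here is a brute-force numerical check in Python. It is only a sketch: it samples 20,000 points on the circle (my own choice) and compares the extreme values of ∣z∣ and arg z with the closed-form expressions tan⁻¹ 0.5 − π ± sin⁻¹(1/√20) and √20 ± 1.

```python
import cmath
import math

centre = -4 - 2j
n_samples = 20000

# Points z = centre + e^(it) on the circle |z + 4 + 2i| = 1.
samples = [centre + cmath.exp(2j * math.pi * n / n_samples) for n in range(n_samples)]

max_modulus = max(abs(z) for z in samples)
min_modulus = min(abs(z) for z in samples)
max_argument = max(cmath.phase(z) for z in samples)
min_argument = min(cmath.phase(z) for z in samples)

theta = math.atan(0.5) - math.pi            # direction of the centre C from the origin
half_angle = math.asin(1 / math.sqrt(20))   # the angle COU

print(max_modulus, min_modulus)    # compare with sqrt(20) + 1 and sqrt(20) - 1
print(max_argument, min_argument)  # compare with theta + half_angle and theta - half_angle
```

The sampled extremes should match the exact values to several decimal places.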

Exercise 168. The complex number z satisfies the equation ∣z − 2 − 2i∣ = 1. (a) What are the maximum and minimum possible values of ∣z∣? (b) For what values of z is ∣z∣ maximised and minimised? (c) What are the maximum and minimum possible values of arg z? (d) For what values of z is arg z maximised and minimised? (Answer on p. 1049f.)


44

De Moivre’s Theorem

Theorem 7. De Moivre’s Theorem. (cos θ + i sin θ)^n = cos (nθ) + i sin (nθ).

Proof. cos θ + i sin θ is the complex number with modulus 1 and argument θ. So by Fact 51, (cos θ + i sin θ)^n is the complex number with modulus 1^n = 1 and argument nθ + 2kπ (where k is the unique integer such that nθ + 2kπ is a principal value). This complex number can be written as cos (nθ) + i sin (nθ).

Here is an alternative proof that uses the Euler Formula:

Proof. (cos θ + i sin θ)^n = (e^(iθ))^n = e^(i(nθ)) = cos (nθ) + i sin (nθ), where the first and third equalities use the Euler Formula (Theorem 6) and the second uses the law of exponents (x^a)^b = x^(ab), which applies even when a is imaginary.

This is a totally free, bonus exercise on mathematical induction! Exercise 169. Prove de Moivre’s Theorem using the method of mathematical induction. (Answer on p. 1051.)

As stated, de Moivre’s Theorem applies only to complex numbers with modulus 1. It is easy to rewrite it so that it applies more generally to any complex number with modulus r:

Corollary 5. [r (cos θ + i sin θ)]^n = r^n [cos (nθ) + i sin (nθ)].

Or equivalently, (re^(iθ))^n = r^n e^(i(nθ)).

Or equivalently, if ∣z∣ = r and arg z = θ, then ∣z^n∣ = r^n and arg z^n = nθ + 2kπ (where k is the unique integer such that nθ + 2kπ is a principal value).
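Corollary 5 can be sketched in Python as follows. The function name power_polar is my own (hypothetical), and the example z = 1 + i with n = 5 is chosen only for illustration; math.remainder is used to bring nθ back into the principal range.

```python
import cmath
import math


def power_polar(r: float, theta: float, n: int) -> tuple[float, float]:
    """Modulus and principal argument of [r e^(i theta)]^n, as in Corollary 5."""
    angle = math.remainder(n * theta, 2 * math.pi)  # lies in [-pi, pi]
    if angle <= -math.pi:
        angle += 2 * math.pi                        # use pi rather than -pi
    return r**n, angle


# z = 1 + i has modulus sqrt(2) and argument pi/4.
modulus, argument = power_polar(math.sqrt(2), math.pi / 4, 5)
print(modulus, argument)

# Compare with computing the power directly.
print(cmath.polar((1 + 1j) ** 5))
```

Both lines should print the same modulus and argument (up to rounding), with the argument being 5π/4 − 2π.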


44.1

Powers of a Complex Number

On the Argand diagram, the powers of a complex number form a “spiral”.

Example 420. Let z = 1 + i. Then:

∣z∣ = √2, arg z = π/4 + 2kπ = π/4 (k = 0),
∣z²∣ = (√2)² = 2, arg z² = 2(π/4) + 2kπ = π/2 (k = 0),
∣z³∣ = (√2)³ = 2√2, arg z³ = 3(π/4) + 2kπ = 3π/4 (k = 0),
∣z⁴∣ = (√2)⁴ = 4, arg z⁴ = 4(π/4) + 2kπ = π (k = 0),
∣z⁵∣ = (√2)⁵ = 4√2, arg z⁵ = 5(π/4) + 2kπ = −3π/4 (k = −1),

etc.

The powers of z = 1 + i, up to the 13th, are illustrated in the figure below.

Example 421. Let z = 1 + 0.4i. Then:

∣z∣ = √(1² + 0.4²) = √1.16, arg z = tan⁻¹(0.4) + 2kπ,
∣z²∣ = 1.16, arg z² = 2 tan⁻¹(0.4) + 2kπ,
∣z³∣ = 1.16√1.16, arg z³ = 3 tan⁻¹(0.4) + 2kπ,
∣z⁴∣ = 1.16², arg z⁴ = 4 tan⁻¹(0.4) + 2kπ,
∣z⁵∣ = 1.16²√1.16, arg z⁵ = 5 tan⁻¹(0.4) + 2kπ,

etc.

The powers of z = 1 + 0.4i, up to the 14th, are illustrated in the figure below.

Exercise 170. (a) Given z = 3 − 4i, find ∣z∣ and arg z. Hence find ∣z 7 ∣ and arg z 7 . Write down (3 − 4i)7 in exponential form. (b) Given z = −5+12i, find ∣z∣ and arg z. Hence find ∣z 8 ∣ and arg z 8 . Write down (−5+12i)8 in exponential form. (Answer on p. 1051.)

Exercise 171. For each of the given values of z, compute z 10 , expressing your answer in all three forms (polar, exponential, and standard). (a) z = −1 − i. (b) z = 2 + i. (c) z = 1 − 3i. (Answer on p. 1052.)


44.2

Roots of a Complex Number

Example 422. What are the roots to the equation z³ = 1 + i? That is, for what values of z is the given equation true?

A naïve application of de Moivre’s Theorem might suggest that

∣z³∣ = 2^(1/2) and arg z³ = π/4 ⟹ ∣z∣ = (2^(1/2))^(1/3) = 2^(1/6) and arg z = (π/4)/3 = π/12.

This is not incorrect, but it gives us only one root to the equation z³ = 1 + i, namely z = 2^(1/6) e^(i(π/12)). In contrast, the Fundamental Theorem of Algebra tells us that since the equation z³ = 1 + i involves a degree-3 polynomial, it should have 3 roots. We’ve just found one root. How do we find the other two?

The trick is to recognise that z³ = 2^(1/2) e^(iπ/4) can also be written as z³ = 2^(1/2) e^(i(π/4+2kπ)), for any integer k. This is because if you plug in any integer k, you will always get 2^(1/2) e^(i(π/4+2kπ)) = 2^(1/2) e^(i(π/4)). The reason is that e^(i(2π)) = 1.

We then have z = (z³)^(1/3) = [2^(1/2) e^(i(π/4+2kπ))]^(1/3) = 2^(1/6) e^(i(π/4+2kπ)/3), for any integer k. Now in contrast to before, different integers k will yield us distinct values for z = 2^(1/6) e^(i(π/4+2kπ)/3). In particular, if we pick values of k so that the values of (π/4 + 2kπ)/3 are principal values, that is, if we pick k = 0, ±1, we have

z = 2^(1/6) e^(i(π/12)), 2^(1/6) e^(i(3π/4)), 2^(1/6) e^(i(−7π/12)).

Observe that beautifully enough, the roots of the equation z³ = 1 + i lie on a circle, namely the circle of radius 2^(1/6) centred on the origin. Moreover, each root can be obtained by rotating another root 2π/3 radians about the origin.
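The three cube roots of 1 + i can be computed with the same recipe in Python. This is a minimal sketch with variable names of my own choosing; it assumes cmath.polar returns the modulus and principal argument of the target.

```python
import cmath
import math

target = 1 + 1j
r, theta = cmath.polar(target)  # modulus 2^(1/2), argument pi/4

# One root for each of k = 0, 1, -1: modulus r^(1/3), argument (theta + 2*k*pi)/3.
roots = [
    r ** (1 / 3) * cmath.exp(1j * (theta + 2 * k * math.pi) / 3)
    for k in (0, 1, -1)
]

for root in roots:
    # Print each root's argument as a multiple of pi, and its cube as a check.
    print(cmath.phase(root) / math.pi, root**3)
```

The printed multiples of π should be 1/12, 3/4 and −7/12, and each cube should come out as 1 + i (up to rounding).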


In general, given z^n, we have ∣z∣ = ∣z^n∣^(1/n) and arg z = (arg z^n + 2kπ)/n, where k are those integers such that (arg z^n + 2kπ)/n ∈ (−π, π].

The annoying part is to figure out the appropriate values of k. So here’s how to do it:

1. If n is odd, then simply pick k = 0, ±1, ±2, . . . , ±(n − 1)/2. (E.g., if n = 15, then pick k = 0, ±1, ±2, . . . , ±7.)
2. If n is even AND arg z^n > 0, then simply pick k = 0, ±1, ±2, . . . , ±(n/2 − 1), −n/2. (E.g., if n = 16 and arg z^n > 0, then pick k = 0, ±1, ±2, . . . , ±7, −8.)
3. If n is even AND arg z^n ≤ 0, then simply pick k = 0, ±1, ±2, . . . , ±(n/2 − 1), n/2. (E.g., if n = 16 and arg z^n ≤ 0, then pick k = 0, ±1, ±2, . . . , ±7, 8.)

You can easily verify that in each case, we do indeed have n roots (just count them). See Fact 90 in the Appendices for a proof (or explanation) of why the above values of k ensure that we have n distinct principal values for arg z.
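If you’d rather not memorise the three rules above, the appropriate values of k can also be found by brute force. The following Python sketch (the function name principal_k_values is my own, hypothetical) simply tests every candidate k and keeps those for which (arg z^n + 2kπ)/n is a principal value.

```python
import math


def principal_k_values(n: int, arg_zn: float) -> list[int]:
    """Integers k for which (arg_zn + 2*k*pi)/n lies in (-pi, pi]."""
    return [
        k
        for k in range(-n, n + 1)  # a generous range of candidates
        if -math.pi < (arg_zn + 2 * k * math.pi) / n <= math.pi
    ]


print(principal_k_values(15, 1.0))   # n odd
print(principal_k_values(16, 1.0))   # n even, positive argument
print(principal_k_values(16, -1.0))  # n even, non-positive argument
```

The three printed lists should run from −7 to 7, from −8 to 7, and from −7 to 8 respectively, matching rules 1 to 3. (Floating-point rounding could matter if arg z^n sits exactly on a boundary such as π.)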

More examples ...


Example 423. Consider $z^4 = -5 + 12i$.

We have $z^4 = 13 e^{i[\pi - \tan^{-1}(12/5) + 2k\pi]}$, for $k \in \mathbb{Z}$. Hence, $z = 13^{1/4} e^{i[\pi - \tan^{-1}(12/5) + 2k\pi]/4}$, for $k \in \mathbb{Z}$.

Since 4 is even and $\arg z^4 > 0$, we should pick $k = 0, \pm 1, -2$ to get

$z = 13^{1/4} e^{i[\pi - \tan^{-1}(12/5)]/4}$ (k = 0),

$z = 13^{1/4} e^{i[3\pi - \tan^{-1}(12/5)]/4}$ (k = 1),

$z = 13^{1/4} e^{i[-\pi - \tan^{-1}(12/5)]/4}$ (k = −1),

$z = 13^{1/4} e^{i[-3\pi - \tan^{-1}(12/5)]/4}$ (k = −2).


Example 424. Consider $z^7 = 3 - 4i$. We have $z^7 = 5 e^{i[\tan^{-1}(-4/3) + 2k\pi]}$, for $k \in \mathbb{Z}$. Hence, $z = 5^{1/7} e^{i[\tan^{-1}(-4/3) + 2k\pi]/7}$, for $k \in \mathbb{Z}$. Since 7 is odd, we should pick $k = 0, \pm 1, \pm 2, \pm 3$.

Now consider $w^8 = 3 - 4i$. We have $w^8 = 5 e^{i[\tan^{-1}(-4/3) + 2m\pi]}$, for $m \in \mathbb{Z}$. Hence, $w = 5^{1/8} e^{i[\tan^{-1}(-4/3) + 2m\pi]/8}$, for $m \in \mathbb{Z}$. Since 8 is even and $\arg w^8 \le 0$, we should pick $m = 0, \pm 1, \pm 2, \pm 3, 4$.

Altogether then, the possible values of z are given by:

$z = 5^{1/7} e^{i[\tan^{-1}(-4/3)]/7}$ (k = 0),
$z = 5^{1/7} e^{i[\tan^{-1}(-4/3) + 2\pi]/7}$ (k = 1),
$z = 5^{1/7} e^{i[\tan^{-1}(-4/3) + 4\pi]/7}$ (k = 2),
$z = 5^{1/7} e^{i[\tan^{-1}(-4/3) + 6\pi]/7}$ (k = 3),
$z = 5^{1/7} e^{i[\tan^{-1}(-4/3) - 2\pi]/7}$ (k = −1),
$z = 5^{1/7} e^{i[\tan^{-1}(-4/3) - 4\pi]/7}$ (k = −2),
$z = 5^{1/7} e^{i[\tan^{-1}(-4/3) - 6\pi]/7}$ (k = −3),

and the possible values of w are given by:

$w = 5^{1/8} e^{i[\tan^{-1}(-4/3)]/8}$ (m = 0),
$w = 5^{1/8} e^{i[\tan^{-1}(-4/3) + 2\pi]/8}$ (m = 1),
$w = 5^{1/8} e^{i[\tan^{-1}(-4/3) + 4\pi]/8}$ (m = 2),
$w = 5^{1/8} e^{i[\tan^{-1}(-4/3) + 6\pi]/8}$ (m = 3),
$w = 5^{1/8} e^{i[\tan^{-1}(-4/3) + 8\pi]/8}$ (m = 4),
$w = 5^{1/8} e^{i[\tan^{-1}(-4/3) - 2\pi]/8}$ (m = −1),
$w = 5^{1/8} e^{i[\tan^{-1}(-4/3) - 4\pi]/8}$ (m = −2),
$w = 5^{1/8} e^{i[\tan^{-1}(-4/3) - 6\pi]/8}$ (m = −3).

[Figure: Argand diagram showing the roots z (labelled k = −3 to k = 3) on the red circle of radius $5^{1/7}$, adjacent roots being $2\pi/7$ apart, together with the roots w (labelled m = −3 to m = 4). The root with k = 0 has argument $\frac{1}{7}\tan^{-1}\left(-\frac{4}{3}\right)$.]

Notice that the eight possible values of w are on a circle whose radius is just slightly shorter than the red circle. (Only the red circle is illustrated.)

Exercise 172. Find the roots of each of the following equations. (Answer on p. 1053.)

(a) $z^{10} = -1 - i$.

(b) $z^{11} = 2 + i$.

(c) $z^{12} = 1 - 3i$.

Part V

Calculus

45 Solving Problems Involving Differentiation

Part I already covered differentiation. This chapter merely ties up some loose ends.

45.1 Inverse Function Theorem (IFT)

The Inverse Function Theorem (IFT) simply says that “the change in y caused by a small unit change in x (dy/dx)” is the reciprocal of “the change in x caused by a small unit change in y (dx/dy)” (see footnote 48). That is,

$\dfrac{dy}{dx} = \dfrac{1}{dx/dy}$.

Example 425. Suppose that adding 1 g of Milo (the x-variable) to a cup of water increases the volume of water by 2 cm³ (the y-variable). That is, dy/dx = 2 cm³ g⁻¹.

Then dx/dy = 0.5 g cm⁻³. That is, if instead we had wanted to increase the volume of water by 1 cm³, we should have added 0.5 g of Milo to the water. Here’s a more typical use of the IFT:

Example 426. Let $x \in [-\pi/2, \pi/2]$. Let $y = \sin x$. Suppose we wish to find dx/dy in terms of x.

Method #1 (longer method using Corollary 2). $y = \sin x \implies x = \sin^{-1} y$. So

$\dfrac{dx}{dy} = \dfrac{d}{dy}\sin^{-1} y = \dfrac{1}{\sqrt{1 - y^2}} = \dfrac{1}{\sqrt{1 - \sin^2 x}} = \dfrac{1}{\cos x}$.

Method #2 (quicker method using the IFT).

$\dfrac{dy}{dx} = \cos x \implies \dfrac{dx}{dy} = \dfrac{1}{\cos x}$.

Exercise 173. Suppose $x^2 y + \sin x = 0$. Find $\dfrac{dy}{dx}$. Hence write down $\dfrac{dx}{dy}$. (You may leave your answers expressed in terms of x and y.) (Answer on p. 1054.)

48 This is informal. For the formal statement of the IFT (optional), see p. 895 in the Appendices.

45.2 Differentiation of Simple Parametric Functions

“Informal Fact”. $\dfrac{dy}{dx} = \dfrac{dy}{dt} \div \dfrac{dx}{dt}$ (provided $\dfrac{dx}{dt} \neq 0$).

Here is an informal “proof” of the above informal fact. By the Chain Rule,

$\dfrac{dy}{dx} = \dfrac{dy}{dt}\,\dfrac{dt}{dx}$. (1)

By the IFT,

$\dfrac{dt}{dx} = \dfrac{1}{dx/dt}$. (2)

Plugging (2) into (1) yields the desired result:

$\dfrac{dy}{dx} = \dfrac{dy}{dt} \div \dfrac{dx}{dt}$.

See p. 896 in the Appendices for a formal version of the above Fact.

Example 427. Let $x = t^5 + t$ and $y = t^6 - t$. Find $\left.\dfrac{dy}{dx}\right|_{t=0}$.

$\dfrac{dy}{dx} = \dfrac{dy}{dt} \div \dfrac{dx}{dt} = \dfrac{6t^5 - 1}{5t^4 + 1}$.

So $\left.\dfrac{dy}{dx}\right|_{t=0} = \dfrac{-1}{1} = -1$. It would be much more difficult (perhaps even impossible) if instead we first tried to express y in terms of x, then compute $\dfrac{dy}{dx}$.
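As a rough check of the parametric rule, the sketch below differentiates $x = t^5 + t$ and $y = t^6 - t$ by hand and divides the two derivatives using exact fractions. The function names are chosen only for this illustration.

```python
from fractions import Fraction

def dx_dt(t):
    return 5 * t**4 + 1      # derivative of x = t^5 + t with respect to t

def dy_dt(t):
    return 6 * t**5 - 1      # derivative of y = t^6 - t with respect to t

def dy_dx(t):
    """dy/dx = (dy/dt) / (dx/dt), valid whenever dx/dt != 0."""
    return Fraction(dy_dt(t), dx_dt(t))

print(dy_dx(0))  # slope of the curve at the point where t = 0
print(dy_dx(1))  # slope of the curve at the point where t = 1
```

Here dx/dt = 5t⁴ + 1 is never zero, so the proviso in the informal fact is satisfied for every real t.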

Exercise 174. Let $x = \cos t + t^2$ and $y = e^t - t^3$. Find $\dfrac{dy}{dx}$. (Answer on p. 1054.)

45.3 Equations of Tangents and Normals

Recall from secondary school the following two facts:

Fact 57. The line with slope m through the point (a, b) has equation $y - b = m(x - a)$.

Fact 58. Given a line with slope m, its perpendicular has slope $-\dfrac{1}{m}$.

Example 428. The curve C has parametric equations $x = t^5 + t$ and $y = t^6 - t$, $t \in \mathbb{R}$. Consider the normal line at the point where t = 0. Find any point(s) at which the normal line intersects the curve C again.

First, note that $t = 0 \implies (x, y) = (0, 0)$. Next,

$\left.\dfrac{dy}{dx}\right|_{t=0} = \left.\dfrac{dy}{dt} \div \dfrac{dx}{dt}\right|_{t=0} = \left.\dfrac{6t^5 - 1}{5t^4 + 1}\right|_{t=0} = -1$.

So the tangent line at the point t = 0 or (0, 0) has slope −1. Thus, by Fact 58, the normal line at this point has slope 1. Its equation is thus $y - 0 = 1(x - 0)$ or more simply $y = x$.

The points where this normal line intersects the curve are thus given by the system of equations $y = x$, $x = t^5 + t$, and $y = t^6 - t$. Putting these together, we have $t^5 + t = t^6 - t \iff t(t^5 - t^4 - 2) = 0$. So t = 0 or t ≈ 1.45 (calculator). (We know by the Fundamental Theorem of Algebra that there must be six roots altogether; in this case, only two are real, while the other four are complex.) So the normal line intersects the curve C again at the point where t ≈ 1.45 or where (x, y) ≈ (7.88, 7.88).

Exercise 175. A curve C is described by the pair of parametric equations $x = t^5 + t$ and $y = t^4 - t$. Find the tangent lines to the curve at the points where t = 0 and t = 1. Find the intersection point of these two tangent lines. (Answer on p. 1054.)

45.4 Connected Rates of Change Problems

Example 429. We unload sand onto a flat surface at a steady rate of 0.01 m³ s⁻¹. Assume the unloaded sand always forms a perfect cone whose height and base diameter are always equal. Let’s find the rate at which the base area of the cone is increasing, at the instant t = 20 s.

First, recall that a cone with base radius r and height h has volume

$V = \dfrac{1}{3}\pi r^2 h$.

Since the base diameter equals the height (or h = 2r), we can rewrite this as

$V = \dfrac{2}{3}\pi r^3$.

Now differentiate the above equation with respect to t, to get

$\dfrac{dV}{dt} = 2\pi r^2 \dfrac{dr}{dt}$.

Let $A = \pi r^2$ be the base area. The rate at which the base area is increasing is

$\dfrac{dA}{dt} = 2\pi r \dfrac{dr}{dt} = \dfrac{dV}{dt} \div r$.

The volume of the sand is always increasing at a rate 0.01 m³ s⁻¹. That is:

$\dfrac{dV}{dt} = 0.01 \text{ m}^3\text{ s}^{-1}$.

So $V|_{t=20} = 20 \times 0.01 = 0.2$ m³. Hence, $r|_{t=20} = \left.\left(\dfrac{3V}{2\pi}\right)^{1/3}\right|_{t=20} = \left(\dfrac{0.3}{\pi}\right)^{1/3}$ m. Altogether then,

$\left.\dfrac{dA}{dt}\right|_{t=20} = 0.01 \div \left(\dfrac{0.3}{\pi}\right)^{1/3} \approx 0.0219 \text{ m}^2\text{ s}^{-1}$.
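The arithmetic in this example can be retraced numerically. The following is a minimal sketch that follows the same chain of relations (V, then r, then dr/dt, then dA/dt); the variable names are chosen only for this illustration.

```python
import math

dV_dt = 0.01            # m^3 per second, the given unloading rate
t = 20.0                # seconds

V = dV_dt * t                            # volume unloaded so far
r = (3 * V / (2 * math.pi)) ** (1 / 3)   # from V = (2/3) * pi * r^3
dr_dt = dV_dt / (2 * math.pi * r**2)     # from dV/dt = 2 * pi * r^2 * dr/dt
dA_dt = 2 * math.pi * r * dr_dt          # from A = pi * r^2

print(round(r, 4), round(dA_dt, 4))      # radius in m, then dA/dt in m^2 per second
```

Changing `t` shows that dA/dt shrinks as the cone grows, which agrees with dA/dt = (dV/dt) ÷ r.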


Exercise 176. (Answer on p. 1055.) Illustrated below is a cone with lateral l, base radius r, and height h. You are given that such a cone has total external surface area (excluding the base) $\pi r l$ and volume $\dfrac{1}{3}\pi r^2 h$.

A manufacturer wishes to manufacture a cone whose volume is fixed at 1 m³ and whose total external surface area (excluding the base) is minimised. Find out what its height should be. (You can follow the steps below.)

(a) Express r in terms of h.

(b) Use the Pythagorean Theorem to express l in terms of r and h. Hence express l solely in terms of h.

(c) Now express the total external surface area A (excludes the base) solely in terms of h.

(d) Show that $\dfrac{dA}{dh} = \dfrac{3}{2} \cdot \dfrac{\pi - \frac{6}{h^3}}{A}$. Hence conclude that the only stationary point is $h = \left(\dfrac{6}{\pi}\right)^{1/3}$.

(e) Use the quotient rule to show that

$\dfrac{d^2A}{dh^2} = \dfrac{9}{4} \cdot \dfrac{\frac{12}{h^4}A^2 - \left(\pi - \frac{6}{h^3}\right)^2}{A^3}$.

(f) Consider the numerator of $\dfrac{d^2A}{dh^2}$. Replace $A^2$ with the expression for A that you found in (c). Now fully expand this numerator. Observe that it is a quadratic and prove that it is always positive.

(g) Hence conclude that the stationary point we found is indeed the global minimum.

45.5 Finding Max/Min Points on the TI84

Example 430. Define $f: [0, 2] \to \mathbb{R}$ by $x \mapsto x - \sin(0.5\pi x)$. We can easily find the minimum point of f analytically:

$\dfrac{df}{dx} = 1 - \dfrac{\pi}{2}\cos\left(\dfrac{\pi}{2}x\right) = 0 \iff \cos\left(\dfrac{\pi}{2}x\right) = \dfrac{2}{\pi} \iff x = \dfrac{2}{\pi}\cos^{-1}\dfrac{2}{\pi} \approx 0.560664181$.

But as an exercise, let’s find it using our TI84.

[Figure: TI84 screenshots after Steps 1 to 6.]

1. Press ON to turn on your calculator.

2. Press Y= to bring up the Y= editor.

3. Press X,T,θ,n − SIN 0 . 5 . To enter “π”, press the blue 2ND button and then π (which corresponds to the ∧ button). Now press X,T,θ,n ) and altogether you will have entered “x − sin(0.5πx)”.

4. Now press GRAPH and the calculator will graph y = x − sin(0.5πx).

Note that in the question given, the domain is actually [0, 2], but we didn’t bother telling the calculator this. So the calculator just went ahead and graphed the equation y = x − sin(0.5πx) for all possible real values of x and y. No big deal, all we need to do is to zoom in to the region where 0 ≤ x ≤ 2.

5. Press the ZOOM button to bring up a menu of ZOOM options.

6. Press 2 to select the Zoom In option. Using the < and > arrow keys, move the cursor to where X = 1.0638298, Y = 0. Now press ENTER and the TI will zoom in a little, centred on the point X = 1.0638298, Y = 0.

It looks like starting at x = 0, the function is decreasing, then hits a minimum point, then keeps increasing. Our goal now is to find out what that minimum point is.

[Figure: TI84 screenshots after Steps 7 to 13.]

7. Press the blue 2ND button and then CALC (which corresponds to the TRACE button). This brings up the CALCULATE menu.

8. Press 3 to select the “minimum” option. This brings you back to the graph, with a cursor flashing. Also, the TI84 prompts you with the question: “Left Bound?” TI84’s MINIMUM function works by you first choosing a “Left Bound” and a “Right Bound” for x. TI84 will then look for the minimum point within your chosen bounds.

9. Using the < and > arrow keys, move the blinking cursor until it is where you want your “Left Bound” to be. For me, I have placed it a little to the left of where I believe the minimum point to be.

10. Press ENTER and you will have just entered your “Left Bound”. TI84 now prompts you with the question: “Right Bound?”.

11. So now just repeat. Using the < and > arrow keys, move the blinking cursor until it is where you want your “Right Bound” to be. For me, I have placed it a little to the right of where I believe the minimum point to be.

12. Again press ENTER and you will have just entered your “Right Bound”. TI84 now asks you: “Guess?” This is just asking if you want to proceed and get TI84 to work out where the minimum point is. So go ahead and:

13. Press ENTER . TI84 now informs you that there is a “Zero” at “X = .56066485”, “Y = −.2105137” and places the cursor at precisely that point. This is our desired minimum point.

(Notice there’s a slight error, because the TI84 uses slightly-imprecise numerical methods. Analytically, we found that the minimum point was x ≈ 0.560664181, while the TI84 claims it is “X = .56066485”.)
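To see how a bounded numerical search of this kind can behave, here is a minimal Python sketch using golden-section search between a left bound and a right bound. This is an assumption made purely for illustration: it is not claimed to be the method the TI84 actually uses, and the function name `golden_section_min` is made up for this example.

```python
import math

def f(x):
    return x - math.sin(0.5 * math.pi * x)

def golden_section_min(func, left, right, tol=1e-10):
    """Shrink [left, right] around the minimum of a function with a single dip."""
    inv_phi = (math.sqrt(5) - 1) / 2          # about 0.618
    while right - left > tol:
        c = right - inv_phi * (right - left)  # interior point nearer the left
        d = left + inv_phi * (right - left)   # interior point nearer the right
        if func(c) < func(d):
            right = d                         # the minimum lies in [left, d]
        else:
            left = c                          # the minimum lies in [c, right]
    return (left + right) / 2

x_numeric = golden_section_min(f, 0.0, 2.0)
x_exact = (2 / math.pi) * math.acos(2 / math.pi)
print(x_numeric, x_exact, f(x_exact))
```

Because f is very flat near its minimum, a search based on function values alone typically agrees with the exact answer only to about seven or eight decimal places, which is one plausible reason for small discrepancies of the kind noted above.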

45.6 Finding the Derivative at a Point on the TI84

This example will also illustrate how to graph parametric equations on the TI84.

Example 431. The curve C has parametric equations $x = t^5 + t$ and $y = t^6 - t$, $t \in \mathbb{R}$. We’ll find $\left.\dfrac{dy}{dx}\right|_{t=1}$ using our TI84, even though this is easily found analytically:

$\left.\dfrac{dy}{dx}\right|_{t=1} = \left.\dfrac{6t^5 - 1}{5t^4 + 1}\right|_{t=1} = \dfrac{5}{6}$.

[Figure: TI84 screenshots after Steps 1 to 8.]

1. Press ON to turn on your calculator.

2. Press MODE to bring up a menu of settings that you can play with. In this example, all we want is to plot a curve based on parametric equations. So:

3. Using the arrow keys, move the blinking cursor to the word PAR (short for parametric) and press ENTER .

4. Now as usual, we’ll input the equations of our curve. To do so, press Y= to bring up the Y= editor. Notice that this screen looks a little different from usual, because we are now under the parametric setting.

5. Press X,T,θ,n ∧ 5 + X,T,θ,n and altogether you will have entered “T⁵ + T” in the first line.

6. Now press ENTER to go to the second line.

7. Press X,T,θ,n ∧ 6 - X,T,θ,n and altogether you will have entered “T⁶ − T” in the second line.

8. Now press GRAPH and the calculator will graph the given pair of parametric equations.

Notice that strangely enough, the graph seems to be empty for the region where x < 0. But clearly there are values for which x < 0. For example, $t = -1.1 \implies (x, y) \approx (-2.71, 2.87)$. So why isn’t the TI84 graphing this?

The reason is that by default, the TI84 graphs only the region where 0 ≤ t ≤ 2π (at least this is so for my particular calculator). We can easily adjust this:

9. Press the WINDOW button to bring up a menu of WINDOW options.

10. Using the arrow keys, the number pad, and the ENTER key as is appropriate, change Tmin and Tmax to your desired values. In my case, I decided somewhat randomly to enter Tmin = −10 and Tmax = 10.

11. Then press GRAPH again and the calculator will graph the given pair of parametric equations, now for the region Tmin ≤ t ≤ Tmax, where Tmin and Tmax are whatever you chose.

[Figure: TI84 screenshots after Steps 9 to 15.]

Actually, the last few steps were really not necessary, if all we wanted was to find $\left.\dfrac{dy}{dx}\right|_{t=1}$, as we do now:

12. Press the blue 2ND button and then CALC (which corresponds to the TRACE button). This brings up the CALCULATE menu, which once again looks a little different under the current parametric setting.

13. Press 2 to select the “dy/dx” option. This brings you back to the graph. Nothing seems to be happening. But now, simply ...

14. Press 1 and now the bottom left of the screen changes to display “T = 1”.

15. Hit ENTER . What you’ve just done is to ask the calculator to calculate $\dfrac{dy}{dx}$ at the point where t = 1. The calculator tells you that “dy/dx = .83333528”.

Again, there’s a slight error: the exact correct answer is $\dfrac{dy}{dx} = \dfrac{5}{6} = 0.8333...$, so again the TI84 is a tiny bit off.
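A numerical derivative is usually an estimate built from nearby points, which is why it can be a tiny bit off. The sketch below uses symmetric differences in t with a step size of h = 0.001. Both the method and the step size are assumptions made for illustration, not a statement of how the TI84 works internally.

```python
def x(t):
    return t**5 + t

def y(t):
    return t**6 - t

def dy_dx_numeric(t, h=1e-3):
    """Estimate dy/dx from symmetric differences in t (the step h is an assumption)."""
    dy = y(t + h) - y(t - h)
    dx = x(t + h) - x(t - h)
    return dy / dx

estimate = dy_dx_numeric(1.0)
exact = 5 / 6
print(estimate, exact, estimate - exact)
```

With this particular step size the estimate overshoots 5/6 by roughly 0.000002, and a smaller h shrinks the gap.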

46 The Maclaurin Series

46.1 Power Series

Recall what polynomials are:

Example 432. $4 + x + 3x^2$ is a 2nd-degree polynomial. $18 + 5x - x^2 + x^4$ is a 4th-degree polynomial.

You can easily imagine what a “∞-degree polynomial” is. Only we don’t call it a “∞-degree polynomial”. Instead, we call it a power series.

Definition 101. A power series is simply any infinite series

$\sum_{i=0}^{\infty} a_i x^i = a_0 + a_1 x + a_2 x^2 + \dots$,

where each $a_i$ is a real constant and x is the variable.

Example 433. $1 + 2x + 3x^2 + 4x^3 + 5x^4 + 6x^5 + \dots$ is a power series, with $a_0 = 1$, $a_1 = 2$, $a_2 = 3$, ..., $a_k = k + 1$, ... So too is $1 - x + x^2 - x^3 + x^4 - x^5 + \dots$, with $a_0 = 1$, $a_1 = -1$, $a_2 = 1$, ..., $a_k = (-1)^k$, ...

As we learnt before, a series can either be convergent or divergent.

Example 434. $1 + x + x^2 + x^3 + x^4 + x^5 + \dots$ is a power series, with $a_0 = a_1 = \dots = a_k = \dots = 1$. It is, moreover, a convergent power series, provided $|x| < 1$. Indeed, provided $|x| < 1$, we know that this is an infinite geometric series that converges to $\dfrac{1}{1-x}$ and we may write

$1 + x + x^2 + x^3 + x^4 + x^5 + \dots = \dfrac{1}{1-x}$.

In contrast, if $|x| \ge 1$, then $1 + x + x^2 + x^3 + x^4 + x^5 + \dots$ is a divergent power series.

For H2 Maths, the only power series we’ll be interested in is called the Maclaurin series.

46.2 Maclaurin Series

Definition 102. Let f be an infinitely-differentiable function. The Maclaurin series of f at x is denoted M(x) and is defined to be the power series

$M(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n + \dots$,

where $a_0 = f(0)$, $a_1 = f'(0)$, $a_2 = \dfrac{f''(0)}{2!}$, $a_3 = \dfrac{f^{(3)}(0)}{3!}$, ..., $a_n = \dfrac{f^{(n)}(0)}{n!}$, ...

Written out explicitly or in summation notation, we have:

$M(x) = f(0) + f'(0)x + \dfrac{f''(0)}{2!}x^2 + \dfrac{f^{(3)}(0)}{3!}x^3 + \dfrac{f^{(4)}(0)}{4!}x^4 + \dots + \dfrac{f^{(n)}(0)}{n!}x^n + \dots = \sum_{i=0}^{\infty} \dfrac{f^{(i)}(0)}{i!}x^i$.

We are often interested in finite-order Maclaurin series:

Definition 103. Let f be an n-times differentiable function. The nth-order Maclaurin series of f at x is denoted $M_n(x)$ and is defined as the nth-degree polynomial (or finite series)

$M_n(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n$,

where $a_0 = f(0)$, $a_1 = f'(0)$, $a_2 = \dfrac{f''(0)}{2!}$, $a_3 = \dfrac{f^{(3)}(0)}{3!}$, ..., $a_n = \dfrac{f^{(n)}(0)}{n!}$.

Example 435. Let’s compute the Maclaurin series for $f: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto e^x$. We have $f(0) = e^0 = 1$. We also have $f'(x) = e^x$, so that $f'(0) = e^0 = 1$. Similarly, $f''(x) = e^x$, so that $f''(0) = e^0 = 1$. Indeed, for any $k \in \mathbb{Z}^+$, $f^{(k)}(x) = e^x$, so that $f^{(k)}(0) = e^0 = 1$. Hence, the Maclaurin series for f is

$M(x) = 1 + 1x + \dfrac{1}{2!}x^2 + \dfrac{1}{3!}x^3 + \dots = 1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \dots$

Here are the first four finite-order Maclaurin series for f:

$M_0(x) = 1$,

$M_1(x) = 1 + x$,

$M_2(x) = 1 + x + \dfrac{x^2}{2!}$,

$M_3(x) = 1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!}$.

Exercise 177. Write down the third-order Maclaurin series for each of the following functions: (Answer on p. 1056.)

(a) $f: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto (1 + x)^n$,

(b) $g: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto \sin x$,

(c) $h: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto \cos x$,

(d) $i: (-1, \infty) \to \mathbb{R}$ defined by $x \mapsto \ln(1 + x)$.

Remark 8. The A-level syllabuses make no mention of the Taylor series and so we won’t talk about it. But just so you know, the Maclaurin series is simply a special case of the Taylor series: specifically, it is the Taylor series about 0.

46.3 The Amazing Maclaurin Series

The Maclaurin series is simply an (infinite) series. And as we saw in Part II, an infinite series may or may not be convergent. The following is a very powerful theorem: “Informal Theorem”. If f satisfies a “nice” property at a, then M (a) converges to f (a). That is, M (a) = f (a). For what exactly this mysterious “nice” property is, see section 84.14 (optional).

The following table is in the List of Formulae you get, so dun need to memorise.

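$f(x) = f(0) + x f'(0) + \dfrac{x^2}{2!} f''(0) + \dots + \dfrac{x^n}{n!} f^{(n)}(0) + \dots$

$(1 + x)^n = 1 + nx + \dfrac{n(n-1)}{2!}x^2 + \dots + \dfrac{n(n-1)\dots(n-r+1)}{r!}x^r + \dots \quad (|x| < 1)$

$e^x = 1 + x + \dfrac{x^2}{2!} + \dots + \dfrac{x^r}{r!} + \dots \quad (\text{all } x)$

$\sin x = x - \dfrac{x^3}{3!} + \dfrac{x^5}{5!} - \dots + \dfrac{(-1)^r x^{2r+1}}{(2r+1)!} + \dots \quad (\text{all } x)$

$\cos x = 1 - \dfrac{x^2}{2!} + \dfrac{x^4}{4!} - \dots + \dfrac{(-1)^r x^{2r}}{(2r)!} + \dots \quad (\text{all } x)$

$\ln(1 + x) = x - \dfrac{x^2}{2} + \dfrac{x^3}{3} - \dots + \dfrac{(-1)^{r+1} x^r}{r} + \dots \quad (-1 < x \le 1)$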

1. The first row of the above table says that if x is a value at which the function f satisfies the “nice” property, then f(x) is equal to the Maclaurin series of f at x.

2. The second row says that $g: \mathbb{R} \to \mathbb{R}$ by $x \mapsto (1 + x)^n$ satisfies the “nice” property for all $x \in (-1, 1)$. Thus, for all $x \in (-1, 1)$, we have $(1 + x)^n = 1 + nx + \dfrac{n(n-1)}{2!}x^2 + \dots$ We say that (−1, 1) is the range of values for which g has a convergent Maclaurin series (see footnote 49).

3. The third row says that $h: \mathbb{R} \to \mathbb{R}$ by $x \mapsto e^x$ satisfies the “nice” property for all $x \in \mathbb{R}$. Thus, for all $x \in \mathbb{R}$, we have $e^x = 1 + x + \dfrac{x^2}{2!} + \dots$ We say that $\mathbb{R}$ is the range of values for which h has a convergent Maclaurin series.

4. The fourth row says that $i: \mathbb{R} \to \mathbb{R}$ by $x \mapsto \sin x$ satisfies the “nice” property for all $x \in \mathbb{R}$. Thus, for all $x \in \mathbb{R}$, we have $\sin x = x - \dfrac{x^3}{3!} + \dfrac{x^5}{5!} - \dots$ We say that $\mathbb{R}$ is the range of values for which i has a convergent Maclaurin series.

5. The fifth row says that $j: \mathbb{R} \to \mathbb{R}$ by $x \mapsto \cos x$ satisfies the “nice” property for all $x \in \mathbb{R}$. Thus, for all $x \in \mathbb{R}$, we have $\cos x = 1 - \dfrac{x^2}{2!} + \dfrac{x^4}{4!} - \dots$ We say that $\mathbb{R}$ is the range of values for which j has a convergent Maclaurin series.

6. The sixth row says that $k: (-1, \infty) \to \mathbb{R}$ by $x \mapsto \ln(1 + x)$ satisfies the “nice” property for all $x \in (-1, 1]$. Thus, for all $x \in (-1, 1]$, we have $\ln(1 + x) = x - \dfrac{x^2}{2} + \dfrac{x^3}{3} - \dots$ We say that (−1, 1] is the range of values for which k has a convergent Maclaurin series.

In the syllabus, these five particular Maclaurin series are called standard series.

49 We should be careful to state that if n < 0, then the domain should be restricted to exclude −1 (where 1 + x = 0).

Example 436. Here we will not rigorously prove that $e^x = 1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \dots$ for all $x \in \mathbb{R}$. Instead, we will merely verify that this equation is “plausible”, for x = 0, 1, 5. (Try these out yourself using the sheet “Maclaurin series” at the usual link.)

For x = 0, we have $e^x = e^0 = 1$. And $M_0(0) = 1$, $M_1(0) = 1$, $M_2(0) = 1$, and indeed $M_n(0) = 1$ for all n. So it does appear “plausible” that $e^0 = M(0)$.

For x = 1, we have $e^x = e^1 \approx 2.718$. And $M_0(1) = 1$, $M_1(1) = 1 + 1 = 2$, $M_2(1) = 1 + 1 + \dfrac{1}{2} = 2.5$, $M_3(1) = 1 + 1 + \dfrac{1}{2} + \dfrac{1}{6} \approx 2.667$, ..., $M_7(1) \approx 2.718$. It appears that $M_n(1) \approx 2.718$ for all n ≥ 7. So it does appear “plausible” that $e^1 = M(1)$.

For x = 5, we have $e^x = e^5 \approx 148.413$. And $M_0(5) = 1$, $M_1(5) = 1 + 5 = 6$, $M_2(5) = 1 + 5 + \dfrac{25}{2} = 18.5$, $M_3(5) = 1 + 5 + \dfrac{25}{2} + \dfrac{125}{6} = 39\dfrac{1}{3}$, ..., $M_{18}(5) \approx 148.413$. It appears that $M_n(5) \approx 148.413$ for all n ≥ 18. So it does appear “plausible” that $e^5 = M(5)$.
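If the spreadsheet is not to hand, the same plausibility check can be run in a few lines of Python. This is a minimal sketch; the function name `maclaurin_exp` is chosen only for this illustration.

```python
import math

def maclaurin_exp(x, n):
    """M_n(x) for f(x) = e^x: the sum of x^i / i! for i = 0, 1, ..., n."""
    return sum(x**i / math.factorial(i) for i in range(n + 1))

for x, orders in [(1, (0, 1, 2, 3, 7)), (5, (0, 1, 2, 3, 18))]:
    for n in orders:
        print(f"M_{n}({x}) = {maclaurin_exp(x, n):.3f}")
    print(f"e^{x} = {math.exp(x):.3f}")
```

Extending the tuples of orders shows how quickly the partial sums settle down for each x.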

Exercise 178. (Tedious, use the sheet named “Maclaurin series” at the usual link.) Verify that for $x = 0, \dfrac{\pi}{2}, 2\pi$, it is similarly “plausible” that $\sin x = x - \dfrac{x^3}{3!} + \dfrac{x^5}{5!} - \dots$ (Answer on p. 1057.)

46.4 Finite-Order Maclaurin Series as Approximations

One important practical use of Maclaurin series is that finite-order Maclaurin series can be used as approximations. Example 437. Consider h ∶ R → R defined by x ↦ ex . We have h(1) = e ≈ 2.718.

The 0th-order Maclaurin series is a pretty terrible approximation: $M_0(1) = 1$. The 1st-order Maclaurin series is slightly better: $M_1(1) = 1 + 1 = 2$. The 2nd-order Maclaurin series is even better: $M_2(1) = 1 + 1 + 0.5 = 2.5$. The 3rd-order Maclaurin series is very good: $M_3(1) = 1 + 1 + 0.5 + \dfrac{1}{3!} \approx 2.667$.

We see that it tends to be that the higher the order of the Maclaurin series, the better the approximation. I emphasise the phrase tends to be, because the approximation can sometimes get worse before it gets better, especially if we’re looking at a value that is far from 0. The next example illustrates.


Example 438. Consider $i: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto \sin x$. We have $i(2\pi) = \sin(2\pi) = 0$. If we do the tedious computations, we find that

$M_0(2\pi) = 0$,
$M_1(2\pi) = M_2(2\pi) = 2\pi \approx 6.283$,
$M_3(2\pi) = M_4(2\pi) \approx -35.059$,
$M_5(2\pi) = M_6(2\pi) \approx 46.547$.

The 0th-order Maclaurin series gets it exactly right. But each subsequent finite-order Maclaurin series then drifts ever further from 0! Having computed the 5th- and 6th-order Maclaurin series, it certainly does not look like the approximations will get any better. Yet if we persevere, we find that

$M_7(2\pi) = M_8(2\pi) \approx -30.159$,
$M_9(2\pi) = M_{10}(2\pi) \approx 11.900$,
$M_{11}(2\pi) = M_{12}(2\pi) \approx -3.195$,
$M_{13}(2\pi) = M_{14}(2\pi) \approx 0.625$,
$M_{15}(2\pi) = M_{16}(2\pi) \approx -0.093$,
$M_{17}(2\pi) = M_{18}(2\pi) \approx 0.011$,
$M_{19}(2\pi) = M_{20}(2\pi) \approx -0.001$,
$M_{21}(2\pi) = M_{22}(2\pi) \approx 0.000$,
...

Indeed, $M_n(2\pi) \approx 0.000$ for all n ≥ 21. So it does indeed look like the Maclaurin series for sin x converges. Graphed below are sin x and $M_{21}(x)$. We see that $M_{21}(x)$ almost perfectly approximates sin x for x ∈ [−7, 7]. But for larger values, $M_{21}(x)$ veers far away from sin x.
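The tedious computations can be handed to a short Python sketch. The function name `maclaurin_sin` is chosen only for this illustration; it adds up the terms of the standard series for sin x whose power does not exceed n.

```python
import math

def maclaurin_sin(x, n):
    """M_n(x) for sin x: terms (-1)^r x^(2r+1) / (2r+1)! with 2r + 1 <= n."""
    return sum((-1)**r * x**(2 * r + 1) / math.factorial(2 * r + 1)
               for r in range((n + 1) // 2))

x = 2 * math.pi
for n in range(1, 23, 2):
    # Even orders add no new term, so M_n(x) = M_(n+1)(x) for odd n.
    print(f"M_{n} = M_{n + 1} = {maclaurin_sin(x, n):.3f}")
```

Replacing `x` with a value well outside [−7, 7], such as 12, shows M₂₁ moving far away from the true value of sin x.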

[Figure: graphs of y = sin x and y = M₂₁(x), with the x-axis running from −12 to 12.]

Graphed below are $y = \sin x$, $M_1(x)$, ..., $M_{10}(x)$. We see that the 1st-order Maclaurin series $M_1(x) = x$ is indeed a good approximation for values of x that are close to 0, but terrible for larger values. Low-order Maclaurin series work well as approximations, provided we are looking at small values of x (i.e. values that are close to 0).

But for large x, even if the Maclaurin series eventually converges, low-order Maclaurin series may fare very poorly as approximations. Indeed, as we saw on the previous page, for sufficiently large values of x, even a relatively-high-order Maclaurin series like M21 (x) will fare poorly as an approximation!

[Figure: graphs of y = sin x and of M₁(x), ..., M₁₀(x).]

If a is not within the range of values for which the Maclaurin series for the function f converges, then M(a) ≠ f(a). That is, the (infinite) Maclaurin series does not converge. Hence, there is no reason to expect that $M_i(a) \approx f(a)$ (i.e. that any finite-order Maclaurin series will serve as a good approximation). Example to illustrate.

Example 439. Consider $k: (-1, \infty) \to \mathbb{R}$ defined by $x \mapsto \ln(1 + x)$. The range of values for which the Maclaurin series converges is (−1, 1]. Suppose we pick x = 2, which is certainly outside this range. Then we have $k(2) = \ln 3 \approx 1.099$. Let’s see what the finite-order Maclaurin series look like:

$M_0(2) = 0$,
$M_1(2) = 2$,
$M_2(2) = 0$,
$M_3(2) = 2\dfrac{2}{3}$,
$M_4(2) = -1\dfrac{1}{3}$,
$M_5(2) = 5\dfrac{1}{15}$,
$M_6(2) = -5.6$,
$M_7(2) \approx 12.686$,
$M_8(2) \approx -19.314$,
$M_9(2) \approx 37.575$,
$M_{10}(2) \approx -64.825$,
$M_{11}(2) \approx 121.356$.

Unlike before, further perseverance will not pay off here. Indeed, the Maclaurin series will grow without bound. For example, M50 (2) ≈ −14.9 trillion! The Maclaurin series simply does not converge for x = 2. So there is no reason to expect any finite-order Maclaurin series to be a good approximation.
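The contrast between a point inside the range of convergence and one outside it can be seen with a short Python sketch. The function name `maclaurin_log1p` is chosen only for this illustration, and x = 0.5 is an extra sample point added here for comparison.

```python
import math

def maclaurin_log1p(x, n):
    """M_n(x) for ln(1 + x): the sum of (-1)^(r+1) x^r / r for r = 1, ..., n."""
    return sum((-1)**(r + 1) * x**r / r for r in range(1, n + 1))

for n in (1, 2, 3, 10, 11, 50):
    # First column: x = 2 (outside (-1, 1]); second column: x = 0.5 (inside).
    print(n, maclaurin_log1p(2, n), maclaurin_log1p(0.5, n))
print("ln 3 =", math.log(3), " ln 1.5 =", math.log(1.5))
```

At x = 0.5 the partial sums settle on ln 1.5, while at x = 2 they swing with ever larger magnitude.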

46.5 Product of Two Power Series

Informally, if two power series converge, then so too does their product; and to get this product, simply multiply the two series together as if they were finite polynomials (see footnote 50).

Example 440. For all $x \in \mathbb{R}$, $\sin x = 0 + 1x + 0x^2 - \dfrac{1}{3!}x^3 + \dots$ and $\cos x = 1 + 0x - \dfrac{1}{2!}x^2 + 0x^3 + \dots$. Thus, for all $x \in \mathbb{R}$, we have $\sin x \cos x = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \dots$, where

Constant term: $c_0 = 0 \times 1 = 0$,

Coefficient on $x$: $c_1 = 0 \times 0 + 1 \times 1 = 1$,

Coefficient on $x^2$: $c_2 = 0 \times \left(-\dfrac{1}{2!}\right) + 1 \times 0 + 0 \times 1 = 0$,

Coefficient on $x^3$: $c_3 = 0 \times 0 + 1 \times \left(-\dfrac{1}{2!}\right) + 0 \times 0 + \left(-\dfrac{1}{3!}\right) \times 1 = -\dfrac{2}{3}$,

⋮

Take a moment to convince yourself that $c_0, c_1, c_2, c_3$ are as stated. So for all $x \in \mathbb{R}$,

$\sin x \cos x = 0 + 1x + 0x^2 + \left(-\dfrac{2}{3}\right)x^3 + \dots = x - \dfrac{2}{3}x^3 + \dots$
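The bookkeeping behind each coefficient follows one pattern: the coefficient on $x^n$ in the product is the sum of $a_i b_{n-i}$ for i = 0, ..., n. The sketch below applies that pattern with exact fractions; the function name `series_product` is chosen only for this illustration.

```python
from fractions import Fraction as F

def series_product(a, b):
    """Coefficients c_0, c_1, ... of the product, where c_n = sum of a_i * b_(n-i)."""
    m = min(len(a), len(b))   # only this many coefficients are fully determined
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(m)]

sin_coeffs = [F(0), F(1), F(0), F(-1, 6)]   # sin x up to the x^3 term
cos_coeffs = [F(1), F(0), F(-1, 2), F(0)]   # cos x up to the x^3 term

print(series_product(sin_coeffs, cos_coeffs))
```

Swapping in the coefficients of another pair of series, such as sin x and ln(1 + x), gives the corresponding product coefficients in the same way.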

The expression on the RHS is, of course, simply also the Maclaurin series for sin x cos x. You are asked to show this in Exercise 179. Exercise 179. Let f ∶ R → R be defined by x ↦ sin x cos x. Evaluate f (0), f ′ (0), f ′′ (0), and f (3) (0). Hence, write down the 3rd-order Maclaurin series for f and verify that this is consistent with what we found in Example 440. (Answer on p. 1058.) The next example illustrates that one must be careful about when the Maclaurin series is convergent:

50 This assertion is formally stated and proven at Fact 97 in the Appendices (optional).

Example 441. For all $x \in \mathbb{R}$, we have $\sin x = 0 + 1x + 0x^2 - \dfrac{1}{3!}x^3 + \dots$ For all $x \in (-1, 1]$, we have $\ln(1 + x) = 0 + 1x - \dfrac{1}{2}x^2 + \dfrac{1}{3}x^3 + \dots$ And so for $x \in (-1, 1]$, we have $\sin x \ln(1 + x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \dots$, where

Constant term: $c_0 = 0 \times 0 = 0$,

Coefficient on $x$: $c_1 = 0 \times 1 + 1 \times 0 = 0$,

Coefficient on $x^2$: $c_2 = 0 \times \left(-\dfrac{1}{2}\right) + 1 \times 1 + 0 \times 0 = 1$,

Coefficient on $x^3$: $c_3 = 0 \times \dfrac{1}{3} + 1 \times \left(-\dfrac{1}{2}\right) + 0 \times 1 + \left(-\dfrac{1}{3!}\right) \times 0 = -\dfrac{1}{2}$,

⋮

And so $\sin x \ln(1 + x) = 0 + 0x + 1x^2 + \left(-\dfrac{1}{2}\right)x^3 + \dots = x^2 - \dfrac{1}{2}x^3 + \dots$, for $x \in (-1, 1]$. This set is simply the intersection of $\mathbb{R}$ and (−1, 1], which are respectively the ranges of values on which the Maclaurin series for sin x and ln(1 + x) converge.

The expression on the RHS is, of course, simply also the Maclaurin series for sin x ln(1 + x). You are asked to show this in Exercise 180.

Exercise 180. Let $f: (-1, \infty) \to \mathbb{R}$ be defined by $x \mapsto \sin x \ln(1 + x)$. Evaluate $f(0)$, $f'(0)$, $f''(0)$, and $f^{(3)}(0)$. Hence, write down the 3rd-order Maclaurin series for f and verify that this is consistent with what we found in Example 441. (Answer on p. 1058.)

46.6 Composition of Two Functions

Informally, if $f(x) = a_0 + a_1 x + a_2 x^2 + \dots$ for $x \in S$ and $g(c) \in S$, then

$f(g(c)) = a_0 + a_1 g(c) + a_2 [g(c)]^2 + \dots$

That is, to get f(g(c)), simply “plug in” g(c) into the power series for f (see footnote 51).

Example 442. Define $f: (-\infty, -1) \cup (-1, \infty) \to \mathbb{R}$ by $f(x) = (1 + x)^{-1}$ and $g: \mathbb{R} \to \mathbb{R}$ by $g(x) = 2x$. We know that for all $x \in (-1, 1)$, we have $f(x) = (1 + x)^{-1} = 1 - x + x^2 - x^3 + \dots$. Thus, $f(g(x)) = (1 + 2x)^{-1} = 1 - (2x) + (2x)^2 - (2x)^3 + \dots$ for all $g(x) = 2x \in (-1, 1)$. Equivalently, $f(g(x)) = (1 + 2x)^{-1} = 1 - 2x + 4x^2 - 8x^3 + \dots$ for all $x \in (-0.5, 0.5)$.

Example 443. Define $f: \mathbb{R} \to \mathbb{R}$ by $f(x) = e^x$ and $g: \mathbb{R} \to \mathbb{R}$ by $g(x) = x^2$. We know that for all $x \in \mathbb{R}$, we have $f(x) = e^x = 1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \dots$.

Thus, $f(g(x)) = e^{x^2} = 1 + x^2 + \dfrac{(x^2)^2}{2!} + \dfrac{(x^2)^3}{3!} + \dots$ for all $g(x) \in \mathbb{R}$. Equivalently,

$f(g(x)) = e^{x^2} = 1 + x^2 + \dfrac{x^4}{2!} + \dfrac{x^6}{3!} + \dots$ for all $x \in \mathbb{R}$.

In the case where g also has a convergent Maclaurin series, we can likewise also simply “plug in” the Maclaurin series for g (see footnote 52). Example:

51 For a more careful and formal version of this assertion, see Fact 98 in the Appendices (optional).

52 Again, for a more careful and formal version of this assertion, see Fact 99 in the Appendices (optional).

Example 444. Define f ∶ (−∞, −1) ∪ (−1, ∞) → R by f (x) = 1/(1 + x) and g ∶ R → R by g(x) = sin x. Write down the Maclaurin series for f ○ g, up to the 4th-order term.

Method #1 (composition method). We know that for all $x \in (-1, 1)$, we have $f(x) = 1 - x + x^2 - x^3 + \dots$. And for all $x \in \mathbb{R}$, we have $g(x) = x - \dfrac{x^3}{3!} + \dots$ Hence, for all $g(x) \in (-1, 1)$, i.e. for all $x \neq (2k + 1)\pi/2$ (for $k \in \mathbb{Z}$), we have

$f(g(x)) = \dfrac{1}{1 + \sin x} = 1 - g(x) + [g(x)]^2 - [g(x)]^3 + \dots$

$= 1 - \left(x - \dfrac{x^3}{3!} + \dots\right) + \left(x - \dfrac{x^3}{3!} + \dots\right)^2 - \left(x - \dfrac{x^3}{3!} + \dots\right)^3 + \dots$

$= 1 - x + x^2 + x^3\left(\dfrac{1}{3!} - 1\right) + \dots = 1 - x + x^2 - \dfrac{5}{6}x^3 + \dots$

Finding the general term for a Maclaurin series is explicitly excluded from the A-level syllabuses. Usually you’ll just have to write down the first few terms.

Method #2 (direct method). Let $h(x) = 1/(1 + \sin x)$. We have $h(0) = 1$. We also have

$\left.\dfrac{dh}{dx}\right|_{x=0} = \left.\dfrac{-\cos x}{(1 + \sin x)^2}\right|_{x=0} = -1$,

$\left.\dfrac{d^2h}{dx^2}\right|_{x=0} = \left.\dfrac{(1 + \sin x)^2 \sin x + 2\cos^2 x\,(1 + \sin x)}{(1 + \sin x)^4}\right|_{x=0} = \left.\dfrac{(1 + \sin x)\sin x + 2\cos^2 x}{(1 + \sin x)^3}\right|_{x=0} = \left.\dfrac{\sin x + 2 - \sin^2 x}{(1 + \sin x)^3}\right|_{x=0} = 2$,

$\left.\dfrac{d^3h}{dx^3}\right|_{x=0} = \left.\dfrac{(1 + \sin x)^3(\cos x - 2\sin x\cos x) - (\sin x + 2 - \sin^2 x)\,3(1 + \sin x)^2\cos x}{(1 + \sin x)^6}\right|_{x=0} = \dfrac{1 - 2 \cdot 3 \cdot 1}{1} = -5$.

Thus,

$\dfrac{1}{1 + \sin x} = 1 + (-1)x + \dfrac{2}{2!}x^2 + \dfrac{-5}{3!}x^3 + \dots = 1 - x + x^2 - \dfrac{5}{6}x^3 + \dots$
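The composition method can also be carried out mechanically on lists of coefficients. The sketch below truncates every series after the $x^3$ term and assumes g(0) = 0, so that higher powers of g(x) cannot affect lower-order coefficients; the helper names `mul` and `compose` are made up for this illustration.

```python
from fractions import Fraction as F

ORDER = 3  # keep terms up to x^3

def mul(a, b):
    """Product of two truncated power series (coefficient lists of length ORDER + 1)."""
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(ORDER + 1)]

def compose(f_coeffs, g_coeffs):
    """Coefficients of f(g(x)) up to x^ORDER, assuming g(0) = 0."""
    assert g_coeffs[0] == 0
    result = [F(0)] * (ORDER + 1)
    power = [F(1)] + [F(0)] * ORDER           # g(x)^0 = 1
    for a in f_coeffs:
        result = [r + a * p for r, p in zip(result, power)]
        power = mul(power, g_coeffs)          # move on to the next power of g(x)
    return result

f_coeffs = [F(1), F(-1), F(1), F(-1)]         # 1/(1 + x) = 1 - x + x^2 - x^3 + ...
g_coeffs = [F(0), F(1), F(0), F(-1, 6)]       # sin x = x - x^3/3! + ...
print(compose(f_coeffs, g_coeffs))            # coefficients of 1/(1 + sin x)
```

The printed list holds the coefficients on 1, x, x², x³ in that order, and can be compared with either method above.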

In the above example, I gave two methods. Use whichever seems to be easier or quicker. Here’s another example:

Example 445. Write down the Maclaurin series for sec x up to the 4th-order term.

Method #1 (composition method).

$\sec x = \dfrac{1}{\cos x} = \dfrac{1}{1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots} = \left[1 - \left(\dfrac{x^2}{2!} - \dfrac{x^4}{4!} + \dots\right)\right]^{-1}$

$= 1 + \left(\dfrac{x^2}{2!} - \dfrac{x^4}{4!} + \dots\right) + \left(\dfrac{x^2}{2!} - \dfrac{x^4}{4!} + \dots\right)^2 + \dots$

$= 1 + \dfrac{x^2}{2!} + x^4\left[-\dfrac{1}{4!} + \left(\dfrac{1}{2!}\right)^2\right] + \dots = 1 + \dfrac{x^2}{2} + \dfrac{5x^4}{24} + \dots$

Method #2 (direct method). Let $f(x) = \sec x$. Then $f(0) = \sec 0 = 1$. And

$f'(x) = \sec x \tan x \implies f'(0) = 0$,

$f''(x) = \sec x \tan^2 x + \sec^3 x \implies f''(0) = 1$,

$f^{(3)}(x) = 6\sec^2 x\, f'(x) - f'(x) \implies f^{(3)}(0) = 0$,

$f^{(4)}(x) = 12 \sec x\, [f'(x)]^2 + 6\sec^2 x\, f''(x) - f''(x) \implies f^{(4)}(0) = 5$.

Thus,

$\sec x = 1 + 0x + \dfrac{1}{2!}x^2 + \dfrac{0}{3!}x^3 + \dfrac{5}{4!}x^4 + \dots = 1 + \dfrac{x^2}{2!} + \dfrac{5x^4}{24} + \dots$

Exercise 181. Write down the third-order Maclaurin series for sin [ln(1 + x)]. State also the range of values for which the Maclaurin series converges. (Answer on p. 1058.)

46.7 How the Maclaurin Series Works (Optional)

Theorem 8. Let f ∶ [−c, c] → R be an infinitely-differentiable function. Suppose that for all x ∈ (−c, c), we have

$$f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + \dots$$

Then the coefficients in the above power series are as given by the Maclaurin series. That is, for each i = 0, 1, 2, . . . , we have

$$a_i = \frac{f^{(i)}(0)}{i!}.$$

Proof. Observe that

$$
\begin{aligned}
f'(x) &= a_1 + 2a_2 x + 3a_3 x^2 + 4a_4 x^3 + \dots, \\
f''(x) &= 2a_2 + (3 \times 2)a_3 x + (4 \times 3)a_4 x^2 + \dots, \\
f^{(3)}(x) &= (3 \times 2)a_3 + (4 \times 3 \times 2)a_4 x + \dots, \\
f^{(4)}(x) &= (4 \times 3 \times 2)a_4 + \dots
\end{aligned}
$$

Thus, f(0) = a₀ and

$$f'(0) = a_1, \qquad f''(0) = 2!\,a_2, \qquad f^{(3)}(0) = 3!\,a_3, \qquad f^{(4)}(0) = 4!\,a_4.$$

Rearranging, we have a₀ = f(0), a₁ = f′(0), a₂ = f″(0)/2!, a₃ = f⁽³⁾(0)/3!, a₄ = f⁽⁴⁾(0)/4!, ..., aᵢ = f⁽ⁱ⁾(0)/i!, ..., as desired.

The above theorem is merely a tantalising hint of why the Maclaurin series “works”. This is because the theorem merely says this: If we make the very big assumption that the infinitely-differentiable function f can be written down as a power series, then the coefficients of the power series are as given by the Maclaurin series. But this is not very useful, because how do we know that the function can be written down as a power series in the first place? For a continuation of this discussion, see section 84.14 in the Appendices.

47 The Indefinite Integral

Definition 104. Given functions f and F, we call F an indefinite integral (or antiderivative or primitive) of f if for all x in the domain of f,

$$F'(x) = f(x).$$

In Leibniz’s notation, we may write F = ∫ f(x) dx or more simply F = ∫ f dx. The statement “F = ∫ f dx” is thus completely equivalent to the statement “F′ = f”.

Example 446. Consider the functions f, F ∶ R → R defined by f(x) = 2x and F(x) = x².

We see that F is an indefinite integral of f, because F′(x) = 2x = f(x) for all x. We can equivalently say that f is the derivative of F. We can also write F = ∫ f dx or $\frac{dF}{dx} = f$.

The statement “the value of F at 5 is 25” can be written as F(5) = 25, or $\left.\int f\,dx\right|_{x=5} = 25$, or $\left.\int f(x)\,dx\right|_{x=5} = 25$.

• The symbol ∫ is called the integration sign. It is an elongated S.

• The symbol dx is called the differential of the variable x. It informs us that the variable of integration is x.

• The function f to be integrated is called the integrand.

Just like with summation, x is a “dummy” variable. We can replace x with any other letter and the function F will still remain exactly the same function.

Example 447. The following two expressions are equal because i on the LHS and r on the RHS are simply “dummy” variables.

$$\sum_{i=1}^{n} i = \sum_{r=1}^{n} r.$$

Similarly, the statement F = ∫ f(x) dx is equivalent to any of the following three statements, because the letters x, a, b, c, etc. are merely “dummy” variables:

$$F = \int f(a)\,da, \qquad \text{or} \qquad F = \int f(b)\,db, \qquad \text{or} \qquad F = \int f(c)\,dc.$$

So the statement “the value of F at 5 is 25” can also be written F(5) = 25 or any of the following four statements:

$$\left.\int f(x)\,dx\right|_{x=5} = 25, \qquad \left.\int f(a)\,da\right|_{a=5} = 25, \qquad \left.\int f(b)\,db\right|_{b=5} = 25, \qquad \left.\int f(c)\,dc\right|_{c=5} = 25.$$

47.1 The Constant of Integration

Example 448. Consider f ∶ R → R defined by f (x) = sin x.

F ∶ R → R defined by F (x) = − cos x is an indefinite integral of f , because F ′ (x) = sin x = f (x) for all x ∈ R. Are there any other indefinite integrals of f ? Yes, certainly.

For example, G ∶ R → R defined by G(x) = − cos x + 200 is also an indefinite integral of f , because G′ (x) = sin x = f (x) for all x ∈ R.

Indeed, any H ∶ R → R defined by H(x) = − cos x + C where C ∈ R is also an indefinite integral of f, because H′(x) = sin x = f(x) for all x ∈ R. In general:

Fact 59. If F is an indefinite integral of f, then so too is G defined by G(x) = F(x) + C, for any C ∈ R. We call C the constant of integration.

Proof. Since F ′ (x) = f (x) for all x, we also have G′ (x) = F ′ (x) + C ′ = F ′ (x) + 0 = f (x) for all x. And so by definition, G is also an indefinite integral of f .
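Fact 59 can also be checked numerically. The sketch below is not a proof: it estimates derivatives by a central difference, and the helper names `derivative` and `is_antiderivative`, the sample points, and the tolerance are all arbitrary choices made for this illustration.

```python
import math


def derivative(F, x, h=1e-6):
    """Central-difference estimate of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


def is_antiderivative(F, f, points, tol=1e-6):
    """Check numerically that F'(x) is close to f(x) at each sample point."""
    return all(abs(derivative(F, x) - f(x)) < tol for x in points)


points = [-2.0, -0.5, 0.0, 1.0, 3.0]

# Try x -> -cos x + C for a few constants C, as in Example 448.
results = {
    C: is_antiderivative(lambda x, C=C: -math.cos(x) + C, math.sin, points)
    for C in (0, 200, -7.5)
}
print(results)
```

Each constant C should pass the check, since adding a constant does not change the slope.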

47.2 The Indefinite Integral is Unique Up to the C.O.I.

The indefinite integral is unique up to the constant of integration. That is, if F and G are both indefinite integrals of f, then it must be that F and G differ only by a constant.

Example 449. Say f has indefinite integral F defined by $F(x) = \sin\left(e^{x^2 - 3x + 5}\right)$. Suppose G is another indefinite integral of f. Then it must be that F(x) = G(x) + C, for some C ∈ R.

Formally: Fact 60. If F and G are both indefinite integrals of f , then there exists some C ∈ R such that F (x) = G(x) + C for all x.

Proof. Since F and G are both indefinite integrals of f, by definition, F′(x) = G′(x) for all x. And thus (F − G)′(x) = 0 for all x. But the only functions whose derivative is always 0 are constant functions.⁵³ Thus, F(x) − G(x) = C, for all x, for some C ∈ R.

Exercise 182. (Answer on p. 1059.) Let f ∶ R → R, F ∶ R → R, G ∶ R → R be defined by f(x) = 4 sin 4x, F(x) = − cos 4x, and G(x) = 8 sin² x cos² x.

(a) Show that F and G are both indefinite integrals of f.

(b) F and G seem to be very different functions. Yet both are indefinite integrals of f. Why does this not contradict our assertion that “the indefinite integral is unique up to a constant”?

⁵³ The alert reader will note that this assertion has not actually been proven in this textbook. We’ll simply take it for granted that “the only functions whose derivative is 0 are constant functions”.

48 Integration Techniques

As before with our notation for differentiation, let’s be clear (pedantic). To take an example, the notation

$$\int \sin x\,dx = -\cos x + C$$

is simply shorthand⁵⁴ for the following long-winded statement: Consider a function with mapping rule x ↦ sin x. Its indefinite integrals are functions, all of which have the mapping rule x ↦ − cos x + C.

⁵⁴ This shorthand statement fails to mention the domain and codomains of the function and its indefinite integral. However, the careful writer will “of course” have specified these nearby.

48.1 Basic Rules of Integration

Proposition 10. Let k, n ∈ R be constants with n ≠ −1. Let f and g be functions with indefinite integrals F and G. Then

$$
\begin{aligned}
\int k\,dx &= kx + C, \\
\int x^n\,dx &= \frac{x^{n+1}}{n+1} + C, && (x \neq 0 \text{ if } n < 0) \\
\int x^{-1}\,dx &= \ln|x| + C, && (x \neq 0) \\
\int e^x\,dx &= e^x + C, \\
\int \sin x\,dx &= -\cos x + C, \\
\int \cos x\,dx &= \sin x + C, \\
\int kf(x)\,dx &= kF(x) + C, \\
\int f(x) \pm g(x)\,dx &= F(x) \pm G(x) + C,
\end{aligned}
$$

where in each case, C is the constant of integration.

Proof. In general, to prove that ∫ f(x) dx = F, it suffices to prove that F′(x) = f(x) for all x.

And so to prove that ∫ x⁻¹ dx = ln ∣x∣ + C, it suffices to prove that $\frac{d}{dx}(\ln|x| + C) = x^{-1}$ for all x ≠ 0. This we now do. First note that

$$
\ln|x| + C = \begin{cases} \ln x + C, & \text{for } x > 0, \\ \ln(-x) + C, & \text{for } x < 0. \end{cases}
$$

Thus,

$$
\frac{d}{dx}(\ln|x| + C) = \begin{cases} \dfrac{1}{x}, & \text{for } x > 0, \\[2mm] \dfrac{-1}{-x} = \dfrac{1}{x}, & \text{for } x < 0. \end{cases}
$$

And so indeed $\frac{d}{dx}(\ln|x| + C) = x^{-1}$ for all x ≠ 0.

You are asked to prove the remaining rules of integration in Exercise 183.

Exercise 183. Prove the remaining rules of integration listed in Proposition 10. (Answer on p. 1060.)

48.2 More Basic Rules of Integration

No need to memorise the following rules of integration, because the List of Formulae contains a (slightly less general) version.

Proposition 11. Let a ≠ 0. Then

(a) $\displaystyle\int \frac{1}{x^2 + a^2}\,dx = \frac{1}{a}\tan^{-1}\left(\frac{x}{a}\right) + C$,

(b) $\displaystyle\int \frac{1}{\sqrt{a^2 - x^2}}\,dx = \sin^{-1}\left(\frac{x}{a}\right) + C$, for ∣x∣ < a,

(c) $\displaystyle\int \frac{1}{x^2 - a^2}\,dx = \frac{1}{2a}\ln\left|\frac{x-a}{x+a}\right| + C$, for x ≠ a,

(d) $\displaystyle\int \frac{1}{a^2 - x^2}\,dx = \frac{1}{2a}\ln\left|\frac{a+x}{a-x}\right| + C$, for x ≠ a,

(e) $\displaystyle\int \tan x\,dx = \ln|\sec x| + C$, for x not an odd multiple of π/2,

(f) $\displaystyle\int \cot x\,dx = \ln|\sin x| + C$, for x not a multiple of π,

(g) $\displaystyle\int \csc x\,dx = -\ln|\csc x + \cot x| + C$, for x not a multiple of π,

(h) $\displaystyle\int \sec x\,dx = \ln|\sec x + \tan x| + C$, for x not an odd multiple of π/2,

where in each case, C is the constant of integration.

Proof. We prove only (a), (c), and (e). (You are asked to prove the remaining rules of integration in Exercise 184.)

(a) By Corollary 2, $\frac{d}{dx}\tan^{-1}x = \frac{1}{x^2+1}$. Hence,

$$\frac{d}{dx}\left[\frac{1}{a}\tan^{-1}\left(\frac{x}{a}\right) + C\right] = \frac{1}{a}\cdot\frac{1}{\left(\frac{x}{a}\right)^2 + 1}\cdot\frac{1}{a} = \frac{1}{x^2 + a^2}.$$

So indeed $\int \frac{1}{x^2+a^2}\,dx = \frac{1}{a}\tan^{-1}\left(\frac{x}{a}\right) + C$.

(c) Let x ≠ a.

Case #1: $\frac{x-a}{x+a} \geq 0$.

$$
\begin{aligned}
\frac{d}{dx}\left(\frac{1}{2a}\ln\left|\frac{x-a}{x+a}\right| + C\right) &= \frac{d}{dx}\left(\frac{1}{2a}\ln\frac{x-a}{x+a} + C\right) = \frac{1}{2a}\frac{d}{dx}\left[\ln(x-a) - \ln(x+a)\right] \\
&= \frac{1}{2a}\left(\frac{1}{x-a} - \frac{1}{x+a}\right) = \frac{1}{2a}\cdot\frac{x+a-(x-a)}{(x-a)(x+a)} = \frac{1}{2a}\cdot\frac{2a}{x^2-a^2} = \frac{1}{x^2-a^2},
\end{aligned}
$$

so that indeed $\int \frac{1}{x^2-a^2}\,dx = \frac{1}{2a}\ln\left|\frac{x-a}{x+a}\right| + C$.

Case #2: $\frac{x-a}{x+a} < 0$.

$$
\begin{aligned}
\frac{d}{dx}\left(\frac{1}{2a}\ln\left|\frac{x-a}{x+a}\right| + C\right) &= \frac{d}{dx}\left(\frac{1}{2a}\ln\frac{a-x}{x+a} + C\right) = \frac{1}{2a}\frac{d}{dx}\left[\ln(a-x) - \ln(x+a)\right] \\
&= \frac{1}{2a}\left(\frac{-1}{a-x} - \frac{1}{x+a}\right) = \frac{1}{2a}\left(\frac{1}{x-a} - \frac{1}{x+a}\right) = \frac{1}{x^2-a^2},
\end{aligned}
$$

so that again $\int \frac{1}{x^2-a^2}\,dx = \frac{1}{2a}\ln\left|\frac{x-a}{x+a}\right| + C$.

(e) Let x not be an odd integer multiple of π/2, so that sec x is well-defined (and non-zero).

Case #1: sec x ≥ 0.

$$\frac{d}{dx}\left(\ln|\sec x| + C\right) = \frac{d}{dx}\left(\ln\sec x + C\right) = \frac{\sec x\tan x}{\sec x} = \tan x,$$

so that indeed ∫ tan x dx = ln ∣sec x∣ + C.

Case #2: sec x < 0.

$$\frac{d}{dx}\left(\ln|\sec x| + C\right) = \frac{d}{dx}\left[\ln(-\sec x) + C\right] = \frac{-\sec x\tan x}{-\sec x} = \tan x,$$

so that again ∫ tan x dx = ln ∣sec x∣ + C.

Exercise 184. Prove the remaining rules of integration listed in Proposition 11. (Answers on pp. 1061, 1062, 1063, and 1064.)

48.3 Trigonometric Functions

The following indefinite integrals are NOT on the List of Formulae and you are definitely required to know how to derive them on your own!

Fact 61. Let m, n ∈ R with m ≠ ±n (this restriction matters only for (d), (e), and (f), where m − n and m + n appear as denominators). Then

(a) $\displaystyle\int \sin^2 x\,dx = \frac{1}{2}x - \frac{\sin 2x}{4} + C$,

(b) $\displaystyle\int \cos^2 x\,dx = \frac{1}{2}x + \frac{\sin 2x}{4} + C$,

(c) $\displaystyle\int \tan^2 x\,dx = \tan x - x + C$,

(d) $\displaystyle\int \sin(mx)\cos(nx)\,dx = -\frac{1}{2}\left\{\frac{\cos[(m-n)x]}{m-n} + \frac{\cos[(m+n)x]}{m+n}\right\} + C$,

(e) $\displaystyle\int \sin(mx)\sin(nx)\,dx = \frac{1}{2}\left\{\frac{\sin[(m-n)x]}{m-n} - \frac{\sin[(m+n)x]}{m+n}\right\} + C$,

(f) $\displaystyle\int \cos(mx)\cos(nx)\,dx = \frac{1}{2}\left\{\frac{\sin[(m-n)x]}{m-n} + \frac{\sin[(m+n)x]}{m+n}\right\} + C$,

where C is the constant of integration.

Proof. (a) The trick is to recall the trigonometric identity cos 2x = 1 − 2 sin² x (this is in the List of Formulae, as are several other trig identities). And so:

$$\int \sin^2 x\,dx = \int \frac{1 - \cos 2x}{2}\,dx = \frac{1}{2}x - \frac{\sin 2x}{4} + C.$$

You are asked to prove the remaining rules of integration in Exercise 185.

Exercise 185. Prove the remaining rules of integration listed in Fact 61. (Answer on p. 1065.)

48.4 Integration by Substitution (IBS)

The method of integration by substitution (IBS) is the Chain Rule in reverse. Before we explain why it works, here are two examples of how it works.⁵⁵

Example 450. Find ∫ cot x dx.

First, observe that $\cot x = \frac{\cos x}{\sin x}$. Next, observe that $\frac{d}{dx}\sin x = \cos x$. Let u = sin x (this is our substitution), so that we also have $\frac{du}{dx} = \cos x$. So:

$$\int \cot x\,dx = \int \frac{\cos x}{\sin x}\,dx = \int \frac{1}{u}\frac{du}{dx}\,dx.$$

So far, nothing unusual has happened. Now we’re going to do something strange, which is to take that last expression and merrily cancel out the dx’s:

$$\int \frac{1}{u}\frac{du}{dx}\,dx = \int \frac{1}{u}\,du + C_1.$$

Didn’t we repeatedly insist earlier that the derivative $\frac{du}{dx}$ is NOT a fraction? So why are we allowed to “merrily” cancel out the dx’s!? Shortly we’ll explain why this move is legitimate. For now, let us blindly persevere:

$$\int \frac{1}{u}\,du + C_1 = \ln|u| + C = \ln|\sin x| + C.$$

Another example, before we explain why exactly we can “merrily” cancel out the dx’s:

Example 451. Let’s find ∫ 2x cos x² dx. Let u = x², so that we also have $\frac{du}{dx} = 2x$. Now,

$$\int 2x\cos x^2\,dx = \int \frac{du}{dx}\cos u\,dx.$$

Again, we merrily cancel out the dx’s and write:

$$\int \frac{du}{dx}\cos u\,dx = \int \cos u\,du + C_1 = \sin u + C = \sin x^2 + C.$$

⁵⁵ Actually we secretly already used this method a few times above, though not very explicitly.

We now explain why it is OK to “merrily cancel out the dx’s”. In fact, saying that we “merrily cancel out the dx’s” is merely a mnemonic (memory device). We are not actually “cancelling out the dx’s”. Instead, we are appealing to the following result:

Theorem 9. Let f ∶ D → R be any continuous function. Let u be a real-valued differentiable function. Assume Range(u) ⊆ D, so that the composite function f ○ u exists. Then

$$\int f\cdot\frac{du}{dx}\,dx = \int f\,du + C.$$

Proof. Let $P = \int f\cdot\frac{du}{dx}\,dx$ and $Q = \int f\,du$. In other words, $\frac{dP}{dx} \overset{1}{=} f\cdot\frac{du}{dx}$ and $\frac{dQ}{du} \overset{2}{=} f$. Using first the Chain Rule and then $\overset{2}{=}$, we have

$$\frac{dQ}{dx} = \frac{dQ}{du}\cdot\frac{du}{dx} \overset{3}{=} f\cdot\frac{du}{dx}.$$

Examining $\overset{1}{=}$ and $\overset{3}{=}$, we see that P and Q are both indefinite integrals for $f\cdot\frac{du}{dx}$. And so by Fact 60 (uniqueness of the indefinite integral up to a constant), P and Q must be equal (or differ by at most a constant). That is, P = Q + C or

$$\int f\cdot\frac{du}{dx}\,dx = \int f\,du + C.$$

The above result says that when doing integration, we are allowed to “merrily” do two things:

1. Replace $\frac{du}{dx}\,dx$ with du (“cancel out the dx’s from $\frac{du}{dx}\,dx$ to get du”);

2. Replace du with $\frac{du}{dx}\,dx$ (“multiply du by $\frac{dx}{dx} = 1$ to get $\frac{du}{dx}\,dx$”).

Of course, we are not actually doing any such things as “cancelling out the dx’s” or “multiplying by $\frac{dx}{dx} = 1$”. These are merely mnemonics. Instead, all we are doing is appealing to the above theorem.⁵⁶ Let’s try more examples, now that we have a better understanding of how this works:

⁵⁶ This is analogous to the Inverse Function Theorem, which states that $\frac{dy}{dx} = 1\big/\frac{dx}{dy}$. The IFT is true NOT because $\frac{dy}{dx}$ and $\frac{dx}{dy}$ are fractions. Nonetheless, as a convenient mnemonic, we can pretend that the IFT holds because $\frac{dy}{dx}$ and $\frac{dx}{dy}$ are fractions, even though strictly speaking, such thinking is wrong.

Example 452. Let’s find $\int e^{\sin x}\cos x\,dx$. Let u = sin x, so that we also have $\frac{du}{dx} = \cos x$. Now we can write

$$\int e^{\sin x}\cos x\,dx = \int e^u\frac{du}{dx}\,dx \overset{1}{=} \int e^u\,du + C_1 = e^u + C = e^{\sin x} + C,$$

where $\overset{1}{=}$ uses Theorem 9. Purely as a mnemonic, we may think of this step $\overset{1}{=}$ as “cancelling out the dx’s”, even though strictly speaking, we are doing no such thing; instead, we are appealing to Theorem 9.

Example 453. Let’s find $\int (x^3 + 5x^2 - 3x + 2)^{50}(3x^2 + 10x - 3)\,dx$. One method would be to fully expand the integrand to get a 152nd-degree polynomial, then integrate this polynomial term-by-term. This is doable, but absurdly tedious.

A better method is to observe that $3x^2 + 10x - 3 = \frac{d}{dx}(x^3 + 5x^2 - 3x + 2)$. Thus, let u = x³ + 5x² − 3x + 2. Then we can write

$$
\begin{aligned}
\int (x^3 + 5x^2 - 3x + 2)^{50}(3x^2 + 10x - 3)\,dx &= \int u^{50}\frac{du}{dx}\,dx \\
&\overset{1}{=} \int u^{50}\,du + C_1 = \frac{u^{51}}{51} + C = \frac{(x^3 + 5x^2 - 3x + 2)^{51}}{51} + C,
\end{aligned}
$$

where once again $\overset{1}{=}$ uses Theorem 9.

In the next three examples, we go in the “opposite direction”. That is, instead of “cancelling out the dx’s” as was done in the previous few examples, we instead “multiply by $\frac{dx}{dx} = 1$”.

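Example 454. Let’s find $\int \sqrt{1 - u^2}\,du$. We’ll use the substitution u = sin x. Note that $\sqrt{1 - u^2} = \sqrt{1 - \sin^2 x} = \cos x$. Moreover, $\frac{du}{dx} = \cos x$. So

$$
\begin{aligned}
\int \sqrt{1 - u^2}\,du &= \int \cos x\,du = \int \cos x\,\frac{dx}{dx}\,du \overset{1}{=} \int \cos x\,\frac{du}{dx}\,dx + C_1 \\
&= \int \cos x\cos x\,dx + C_1 = \int \cos^2 x\,dx + C_1 \overset{2}{=} \frac{x}{2} + \frac{\sin 2x}{4} + C \\
&= \frac{1}{2}\sin^{-1}u + \frac{2\sin x\cos x}{4} + C = \frac{\sin^{-1}u + u\sqrt{1 - u^2}}{2} + C,
\end{aligned}
$$

where $\overset{1}{=}$ uses Theorem 9 and $\overset{2}{=}$ uses Fact 61.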

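Example 455. Let’s find $\int \frac{x^2}{\sqrt[3]{1 + 2x}}\,dx$. We’ll use the substitution u³ = 1 + 2x. Note that $x = \frac{1}{2}(u^3 - 1)$ and $\frac{dx}{du} = \frac{3}{2}u^2$. So

$$
\begin{aligned}
\int \frac{x^2}{\sqrt[3]{1 + 2x}}\,dx &= \int \frac{\left[\frac{1}{2}(u^3 - 1)\right]^2}{\sqrt[3]{u^3}}\,dx = \int \frac{(u^3 - 1)^2}{4u}\,dx = \int \frac{(u^3 - 1)^2}{4u}\frac{du}{du}\,dx \\
&\overset{1}{=} \int \frac{(u^3 - 1)^2}{4u}\frac{dx}{du}\,du + C_1 = \int \frac{(u^3 - 1)^2}{4u}\left(\frac{3}{2}u^2\right)du + C_1 = \int \frac{3u(1 - 2u^3 + u^6)}{8}\,du + C_1 \\
&= \frac{3}{8}\int u - 2u^4 + u^7\,du + C_1 = \frac{3}{8}\left(\frac{u^2}{2} - \frac{2u^5}{5} + \frac{u^8}{8}\right) + C = \frac{3}{8}u^2\left(\frac{1}{2} - \frac{2u^3}{5} + \frac{u^6}{8}\right) + C \\
&= \frac{3}{8}(1 + 2x)^{2/3}\left[\frac{1}{2} - \frac{2(1 + 2x)}{5} + \frac{(1 + 2x)^2}{8}\right] + C \\
&= \frac{3}{8}(1 + 2x)^{2/3}\,\frac{20 - 16 - 32x + 5 + 20x + 20x^2}{40} + C = \frac{3}{8}(1 + 2x)^{2/3}\,\frac{20x^2 - 12x + 9}{40} + C,
\end{aligned}
$$

where $\overset{1}{=}$ uses Theorem 9. (The last line is just further simplification, which is nice but not necessary.)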
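After a long substitution like the one in Example 455, a quick numerical sanity check can catch algebra slips: the slope of the answer should match the integrand. The sketch below assumes C = 0 and uses a central-difference slope; the names `integrand`, `candidate`, and `slope`, and the sample points, are made up for this illustration.

```python
def integrand(x):
    """The function being integrated in Example 455."""
    return x**2 / (1 + 2 * x) ** (1 / 3)


def candidate(x):
    """The antiderivative found in Example 455, taking C = 0."""
    return (3 / 8) * (1 + 2 * x) ** (2 / 3) * (20 * x**2 - 12 * x + 9) / 40


def slope(F, x, h=1e-5):
    """Central-difference estimate of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


samples = [0.0, 0.5, 1.0, 2.0, 5.0]  # all satisfy 1 + 2x > 0
errors = [abs(slope(candidate, x) - integrand(x)) for x in samples]
print(max(errors))  # should be tiny if the answer is right
```

A small maximum error does not prove the answer, but a large one would signal a mistake somewhere in the working.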


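Example 456. Let’s find $\int \frac{1}{1 + 3\cos^2 x}\,dx$. We’ll use the substitution u = tan x or x = tan⁻¹ u. Note that

$$\cos^2 x = \frac{1}{\sec^2 x} = \frac{1}{1 + \tan^2 x} = \frac{1}{1 + u^2} \qquad \text{and} \qquad \frac{dx}{du} = \frac{1}{1 + u^2}.$$

So

$$
\begin{aligned}
\int \frac{1}{1 + 3\cos^2 x}\,dx &= \int \frac{1}{1 + \frac{3}{1 + u^2}}\,dx = \int \frac{1}{1 + \frac{3}{1 + u^2}}\frac{du}{du}\,dx \overset{1}{=} \int \frac{1}{1 + \frac{3}{1 + u^2}}\frac{dx}{du}\,du + C_1 \\
&= \int \frac{1}{1 + \frac{3}{1 + u^2}}\cdot\frac{1}{1 + u^2}\,du + C_1 = \int \frac{1}{1 + u^2 + 3}\,du + C_1 = \int \frac{1}{2^2 + u^2}\,du + C_1 \\
&\overset{2}{=} \frac{1}{2}\tan^{-1}\left(\frac{u}{2}\right) + C = \frac{1}{2}\tan^{-1}\left(\frac{\tan x}{2}\right) + C,
\end{aligned}
$$

where $\overset{1}{=}$ uses Theorem 9 (“multiply by $\frac{du}{du} = 1$”) and $\overset{2}{=}$ uses Proposition 11.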

Usually, the hard part is to figure out the appropriate substitution to make. Fortunately, in the A-level exams, you’ll always be told what substitution to make.

Exercise 186. (Answers on pp. 1066 and 1067.)

(a) (i) Use the substitution x = 3 sec u to find $\int \frac{9}{x^2\sqrt{x^2 - 9}}\,dx$.

(ii) Now use instead the substitution $x = \sqrt{\frac{9}{1 - u}}$ to find $\int \frac{9}{x^2\sqrt{x^2 - 9}}\,dx$.

(iii) Show that $\sin(\sec^{-1} y) = \sqrt{1 - \frac{1}{y^2}}$. Then explain why your answers in (i) and (ii) are consistent.

(b) (i) Use the substitution $x = \frac{3}{2}\tan u$ to find $\int \frac{x^3}{(4x^2 + 9)^{3/2}}\,dx$.

(ii) Now use instead the substitution u = 4x² + 9 to find $\int \frac{x^3}{(4x^2 + 9)^{3/2}}\,dx$.

(iii) Show that $\cos(\tan^{-1} y) = \frac{1}{\sqrt{1 + y^2}}$. Then explain why your answers in (i) and (ii) are consistent.

48.5 Integration by Parts (IBP)

The method of integration by parts (IBP) is the Product Rule in reverse.

Theorem 10. (Integration by Parts.) Let u and v be differentiable functions, which have continuous derivatives u′ and v′. Then

$$\int uv'\,dx = uv - \int u'v\,dx.$$

Proof. Differentiate the RHS to get u′v + uv′ − u′v = uv′. This shows that ∫ uv′ dx = uv − ∫ u′v dx, as desired.

To choose v′, use the rule of thumb DETAIL: D stands for $\frac{dv}{dx}$, and the remaining letters stand for Exponential, Trig, Algebraic, Inverse trig, Log. (This is because exponential functions are easiest to integrate, followed by trigonometric functions, etc.)

Example 457. Find ∫ xeˣ dx.

By the DETAIL rule of thumb, we should choose v′ = eˣ (and so u = x). Now,

$$\int \underbrace{x}_{u}\,\underbrace{e^x}_{v'}\,dx = uv - \int u'v\,dx = xe^x - \int e^x\,dx = xe^x - e^x + C = e^x(x - 1) + C.$$

Sometimes we need to apply IBP more than once:

Example 458. Find ∫ x²eˣ dx.

By the DETAIL rule of thumb, we should choose v′ = eˣ (and so u = x²). Now,

$$
\begin{aligned}
\int \underbrace{x^2}_{u}\,\underbrace{e^x}_{v'}\,dx &= uv - \int u'v\,dx = x^2e^x - \int 2xe^x\,dx \\
&= x^2e^x - 2e^x(x - 1) + C = e^x(x^2 - 2x + 2) + C. \qquad \text{(Use the previous example.)}
\end{aligned}
$$

Exercise 187. Find ∫ x sin x dx and ∫ x² sin x dx. (Answer on p. 1068.)
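Examples 457 and 458 suggest a pattern: each application of IBP with u = xⁿ and v′ = eˣ lowers the power of x by one. The sketch below mechanises that pattern. It is an illustration only, not something required by the syllabus, and the function name `exp_poly_antiderivative` is made up here.

```python
def exp_poly_antiderivative(n):
    """Coefficients [c0, c1, ..., cn] of the polynomial p for which
    the integral of x^n e^x is e^x p(x) + C.

    Integration by parts with u = x^n and v' = e^x gives
        integral of x^n e^x  =  x^n e^x - n * (integral of x^(n-1) e^x),
    so p_n(x) = x^n - n * p_(n-1)(x), starting from p_0(x) = 1.
    """
    coeffs = [1]  # p_0(x) = 1
    for k in range(1, n + 1):
        coeffs = [-k * c for c in coeffs] + [1]  # -k * p_(k-1)(x) + x^k
    return coeffs


print(exp_poly_antiderivative(1))  # [-1, 1], i.e. x - 1
print(exp_poly_antiderivative(2))  # [2, -2, 1], i.e. x^2 - 2x + 2
```

The two printed results correspond to the answers eˣ(x − 1) and eˣ(x² − 2x + 2) obtained in Examples 457 and 458.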


49 The Fundamental Theorems of Calculus (FTCs)

The problem of finding the definite integral is the problem of finding the area under a curve. The problem of finding the derivative is the problem of finding the slope of the tangent. The two Fundamental Theorems of Calculus (FTCs) show that, surprisingly enough, these two problems are intimately (indeed inversely) related. This chapter is a largely-informal discussion of the intuition behind the FTCs.

49.1 The Area Function

Given a continuous real-valued function f , its area function is denoted A and is, informally, defined by the mapping A(c) = “Area bounded by the graph of f , the horizontal axis, and the vertical lines x = 0 and x = c”.

Example 459. Graphed below is the continuous function f ∶ R₀⁺ → R defined by f(x) = √x + 1. The area A(6) is highlighted in red. It is the area bounded by the graph of f, the horizontal axis, and the vertical lines x = 0 and x = 6.

Using a graphing calculator, A(6) = 15.79795897... Is there a way I can figure this out without a graphing calculator? Here’s one possible approach — let’s approximate the area by using three rectangles.

We’ll use three rectangles of equal width, so each rectangle has width 2. The leftmost rectangle will occupy the interval [0, 2], the middle rectangle will occupy [2, 4], and the rightmost rectangle will occupy [4, 6].

For each rectangle, we choose its height to be the lowest value attained by the function in that interval. In the interval [0, 2], the lowest value attained by f is f (0). So the leftmost blue rectangle has height f (0) and thus area Base × Height = 2f (0). Similarly, the middle green rectangle has height f (2), because in the interval [2, 4], the lowest value attained by f is f (2). Hence, it has area Base × Height = 2f (2).

The rightmost grey rectangle has height f (4), because in the interval [4, 6], the lowest value attained by f is f (4). Hence, it has area Base × Height = 2f (4).

[Figure: graph of f with the three lower-sum rectangles (blue, green, grey) on the intervals [0, 2], [2, 4], and [4, 6].]

Altogether, the total area of these three rectangles is

$$S_{L3} = 2f(0) + 2f(2) + 2f(4) = 2\left[(\sqrt{0} + 1) + (\sqrt{2} + 1) + (\sqrt{4} + 1)\right] \approx 12.828,$$

where $S_{L3}$ stands for “Lower Sum in the case of 3 rectangles with equal width”. This is our very first approximation of the area A(6). We see that this is a fairly poor approximation, because the true area is A(6) = 15.79795897... Nonetheless, it is useful: we know that $S_{L3}$ is a lower bound for A(6). That is, we know that $S_{L3} \leq A(6)$. We’ll next try a different approximation, $S_{U3}$. Can you guess what this involves?

We’ll again use three rectangles of equal width (width 2), occupying intervals [0, 2], [2, 4], and [4, 6]. The difference now is that for each rectangle, we choose its height to be the highest value attained by the function in that interval. In the interval [0, 2], the highest value attained by f is f(2). So the leftmost blue rectangle has height f(2) and thus area Base × Height = 2f(2).

Similarly, the middle green rectangle has height f(4), because in the interval [2, 4], the highest value attained by f is f(4). Hence, it has area Base × Height = 2f(4).

The rightmost grey rectangle has height f(6), because in the interval [4, 6], the highest value attained by f is f(6). Hence, it has area Base × Height = 2f(6).

[Figure: graph of f with the three upper-sum rectangles (blue, green, grey) on the intervals [0, 2], [2, 4], and [4, 6].]

Altogether, the total area of these three rectangles is

$$S_{U3} = 2f(2) + 2f(4) + 2f(6) = 2\left[(\sqrt{2} + 1) + (\sqrt{4} + 1) + (\sqrt{6} + 1)\right] \approx 17.727,$$

where $S_{U3}$ stands for “Upper Sum in the case of 3 rectangles with equal width”. This is our second approximation of the area A(6). We see that again, this is a fairly poor approximation, because the true area is A(6) = 15.79795897.... Nonetheless, it is again useful: we know that $S_{U3}$ is an upper bound for A(6). That is, we know that $A(6) \leq S_{U3}$. Altogether, we know that

$$12.828 \approx S_{L3} \leq A(6) \leq S_{U3} \approx 17.727.$$

Can we do better than this? Yes, certainly. An obvious follow-up would be to increase the number of rectangles we use. Let’s next use 6 rectangles instead.

We’ll now use six rectangles of equal width (width 1), occupying intervals [0, 1], [1, 2], [2, 3], [3, 4], [4, 5], and [5, 6]. To calculate the Lower Sum $S_{L6}$, we give the first rectangle height of f(0), the second f(1), ..., the sixth f(5). So each rectangle has, respectively, area 1 × f(0), 1 × f(1), ..., and 1 × f(5). Hence,

$$S_{L6} = f(0) + f(1) + f(2) + f(3) + f(4) + f(5) \approx 14.382.$$

[Figure: two graphs of f, one with the six lower-sum rectangles and one with the six upper-sum rectangles.]

Analogously, to calculate the Upper Sum $S_{U6}$, we give the first rectangle height of f(1), the second f(2), ..., the sixth f(6). So each rectangle has, respectively, area 1 × f(1), 1 × f(2), ..., and 1 × f(6). Hence,

$$S_{U6} = f(1) + f(2) + f(3) + f(4) + f(5) + f(6) \approx 16.832.$$

Once again, A(6) has lower and upper bounds $S_{L6}$ and $S_{U6}$. That is,

$$14.382 \approx S_{L6} \leq A(6) \leq S_{U6} \approx 16.832.$$
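The lower and upper sums in this example are easy to compute by machine. The sketch below assumes, as is true for f(x) = √x + 1, that the function is increasing, so that the lowest value on each interval is at its left end and the highest at its right end. The function name `lower_upper_sums` is made up for this illustration.

```python
import math


def f(x):
    return math.sqrt(x) + 1


def lower_upper_sums(func, a, b, n):
    """Lower and upper sums with n equal-width rectangles on [a, b].

    Assumes func is increasing on [a, b], so the lowest value on each
    subinterval is at its left endpoint and the highest at its right endpoint.
    """
    width = (b - a) / n
    edges = [a + i * width for i in range(n + 1)]
    lower = sum(func(x) for x in edges[:-1]) * width
    upper = sum(func(x) for x in edges[1:]) * width
    return lower, upper


for n in (3, 6):
    lower, upper = lower_upper_sums(f, 0, 6, n)
    print(n, round(lower, 3), round(upper, 3))
```

With n = 3 and n = 6 this should reproduce the four values computed by hand above. Passing a larger n narrows the gap between the two bounds.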

You can see where this is going. We can get ever better lower and upper bounds, by increasing the number of rectangles we use.

Exercise 188. Continuing with the above example, find $S_{L12}$ and $S_{U12}$. Hence give lower and upper bounds for A(6). (Answer on p. 1069.)

Let n be the number of rectangles we use. We will always have $S_{Ln} \leq A(6) \leq S_{Un}$.

As n increases, we have increasingly-many, increasingly-slim rectangles. As n → ∞, we have infinitely-many, infinitely-slim rectangles, whose total area should approach A(6).

Indeed, this “slim rectangles approach” is exactly how the area function is formally and rigorously defined — see Section 84.17 in the Appendices for the details (optional).


It appears then that we need to do more maths to figure out how to add up all these “infinitely-many, infinitely-slim rectangles”. ... But it turns out though that there is an absolutely-fantastic shortcut we can use.

49.2 The First Fundamental Theorem of Calculus (FTC1)

Given a function f , we sketched an idea of how to find its area function A — approximate the area under the curve using “infinitely-many, infinitely-slim rectangles” and add up the total area of these rectangles. This though was merely a sketch of an idea. How do we go about adding up the area of these “infinitely-many, infinitely-slim rectangles”? Easier said than done! It turns out though that we’ll take an entirely different approach. Strangely enough, instead of finding the area function A, we shall try to find the area function’s derivative A′ . This seems utterly bizarre. If we don’t know what A is in the first place, how could we possibly figure out what A′ is? This is analogous to asking someone, who has no idea where Singapore is, to find the Singapore Flyer! But surprisingly, it turns out to be much easier to find A′ than it is to find A! We’ll recycle the example from the last section:

Example 459 (continued from the previous section). Pick some x that’s just a little larger than 6. A(x) is the area bounded by the graph of f and the horizontal axis, between the vertical line at 0 and the vertical line at x. And so A(x) is just slightly larger than A(6).

[Figure: graph of f, with the area A(6) and the thin green vertical strip between 6 and x.]

Consider the thin green vertical strip. This green strip is roughly rectangular in shape: its left, right, and bottom edges are all straight. Only its upper edge is not straight. This green strip’s area is exactly A(x) − A(6). Moreover, we know that its base is x − 6, its left side is f(6), and its right side is f(x). Hence,

$$\underbrace{(x - 6) \times f(6)}_{\substack{\text{Area of rectangle with} \\ \text{base } x - 6 \text{ and height } f(6)}} < \underbrace{A(x) - A(6)}_{\substack{\text{Area of thin green} \\ \text{vertical strip}}} < \underbrace{(x - 6) \times f(x)}_{\substack{\text{Area of rectangle with} \\ \text{base } x - 6 \text{ and height } f(x)}}.$$

Rearranging, we have

$$f(6) < \frac{A(x) - A(6)}{x - 6} < f(x).$$

Now consider what happens if we pick another x that is slightly smaller but still larger than 6. Then the above pair of inequalities will still hold. Indeed, for all x > 6, the above pair of inequalities hold. If we let x approach 6, the above pair of inequalities becomes

$$\lim_{x \to 6} f(6) \leq \lim_{x \to 6} \frac{A(x) - A(6)}{x - 6} \leq \lim_{x \to 6} f(x).$$

(For why the strict inequalities < became weak inequalities ≤, either you simply trust me or see Fact 7 in the Appendices.)

Of course, $\lim_{x \to 6} f(6)$ is simply f(6). And by the continuity of f, $\lim_{x \to 6} f(x) = f(6)$. Hence,

$$f(6) \leq \lim_{x \to 6} \frac{A(x) - A(6)}{x - 6} \leq f(6),$$

which means of course that $\lim_{x \to 6} \frac{A(x) - A(6)}{x - 6} = f(6)$. But wait a second ... what is $\lim_{x \to 6} \frac{A(x) - A(6)}{x - 6}$? It is simply the value of the derivative of A at 6!!

$$A'(6) \overset{\text{Def}}{=} \lim_{x \to 6} \frac{A(x) - A(6)}{x - 6}.$$

We thus conclude that astonishingly enough, A′ (6) = f (6). And this is more generally true — given a continuous function f , the derivative of its area function is simply the original function itself! This is the First Fundamental Theorem of Calculus.
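The squeeze in this argument can be watched numerically. The sketch below estimates the strip’s area A(x) − A(6) with a midpoint sum (an approximation, chosen here only for convenience) and divides by the base x − 6. The name `strip_area` and the values of h are made up for this illustration.

```python
import math


def f(x):
    return math.sqrt(x) + 1


def strip_area(a, b, n=1000):
    """Midpoint-sum estimate of the area under f between x = a and x = b."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


quotients = {}
for h in (0.5, 0.1, 0.01, 0.001):
    x = 6 + h
    quotients[h] = strip_area(6, x) / (x - 6)  # (A(x) - A(6)) / (x - 6)
    print(h, f(6), quotients[h], f(x))
```

In each printed row the middle number should sit between f(6) and f(x), and all three should close in on f(6) as h shrinks.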


Note that earlier we defined the area function so that we started counting the area from x = 0 (vertical axis). But this was just to keep the above arguments and diagrams simple. It makes no difference if we start counting the area from any other x = a instead.

Theorem 11. (First Fundamental Theorem of Calculus [FTC1], informal statement.) Let f be a real-valued continuous function with area function A. Then A′ = f.

In words, the FTC1 says that the derivative of the area function of a continuous function is simply the function itself! Equivalently, an indefinite integral (or antiderivative) of a continuous function is the area function.⁵⁷

Exercise 189. Why did I use the indefinite article an, rather than the definite article the, in the last sentence above? (Answer on p. 1069.)

Next is a familiar example from physics to illustrate the FTC1.

Example 460. Graphed below is the velocity v (m s⁻¹) of a car as a function of time t (s). Recall that the area under the graph is the distance travelled by the car. For example, the shaded red area A(5) is the total distance travelled by the car after 5 s. But the derivative of the distance travelled with respect to time is precisely the velocity! Hence, this example illustrates the FTC1: the derivative of the area under the graph of a function is precisely the function itself!

[Figure: velocity (m s⁻¹) against time (s), with the area A(5) shaded in red.]

⁵⁷ For a formal, rigorous statement of FTC1 and its proof, see section 84.17 in the Appendices.

49.3 The Definite (or Riemann) Integral

$\int_p^q f(x)\,dx$ is the area under the graph of f, between p and q. (Compare this to the area function: A(k) is the area under the graph of f, between 0 and k.) We call $\int_p^q f(x)\,dx$ the definite (or Riemann) integral of f between p and q.⁵⁸

Example 461. Define f ∶ R₀⁺ → R by f(x) = √x + 1. The definite integral $\int_1^3 f\,dx$ (simply the area under f, between 1 and 3) is highlighted in blue. Similarly, the definite integral $\int_5^8 f\,dx$ (simply the area under f, between 5 and 8) is highlighted in red.

[Figure: graph of f, with the area between 1 and 3 shaded in blue and the area between 5 and 8 shaded in red.]

IMPORTANT REMARK

The indefinite integral $\int f\,dx$ and the definite integral $\int_p^q f\,dx$ have very similar names and notation. But do not make the mistake of believing that we’ve simply defined them so that they’re similar. We have not.

• The indefinite integral $\int f(x)\,dx$ is an antiderivative of f. (It is also a function.)

• The definite integral $\int_a^b f(x)\,dx$ is the area under the graph of f, between a and b. (It is also a number.)

A priori, there is no reason whatsoever to believe that some antiderivative of f and some area under the graph of f have anything in common. It is the two FTCs that establish the connection between the two. This is what makes the FTCs remarkable and surprising. And it is because of this connection that we give these two distinctly-defined mathematical objects such similar names and notation.

⁵⁸ For the formal definition of the definite integral, see section 84.17 in the Appendices.

49.4 The Second Fundamental Theorem of Calculus (FTC2)

Our (informal) definitions tell us that:

• $A(q)$ is the area under the curve between 0 and $q$;

• Similarly, $A(p)$ is the area under the curve between 0 and $p$;

• And $\int_p^q f\,dx$ is the area under the curve between $p$ and $q$.

Thus, $\int_p^q f\,dx = A(q) - A(p)$. From this and also with the aid of the FTC1, we can easily prove the FTC2.

Theorem 12. (The Second Fundamental Theorem of Calculus [FTC2].) Let $f : [a, b] \to \mathbb{R}$ be a continuous function and $p, q \in [a, b]$. Then

$$\int_p^q f\,dx = \left.\int f\,dx\,\right|_{x=q} - \left.\int f\,dx\,\right|_{x=p}.$$

Proof. By Theorem 11, the area function $A$ is an indefinite integral of $f$. And so by Fact 60, $A$ and $\int f\,dx$ differ by at most a constant. That is, for all $r \in [a, b]$, we have

$$A(r) = \left.\int f\,dx\,\right|_{x=r} + C.$$

Hence,

$$\int_p^q f\,dx = A(q) - A(p) = \left[\left.\int f\,dx\,\right|_{x=q} + C\right] - \left[\left.\int f\,dx\,\right|_{x=p} + C\right] = \left.\int f\,dx\,\right|_{x=q} - \left.\int f\,dx\,\right|_{x=p}.$$

50 Definite Integrals

To repeat:

• The indefinite integral $\int f\,dx$ is an antiderivative of $f$.

• The definite integral $\int_a^b f\,dx$ is an area under the graph of $f$.

A priori, there is no reason to believe that the two are in any way related. It is the two FTCs that establish their remarkable relationship:

$$\int_a^b f\,dx = \left.\int f\,dx\,\right|_{x=b} - \left.\int f\,dx\,\right|_{x=a}.$$

To compute $\int_a^b f\,dx$, one method would have been to painfully add up the areas of the “infinitely-many infinitely-slim rectangles”. Thanks to the FTCs, we have a wonderful alternative method that is much easier:

1. Find any indefinite integral of $f$.
2. The difference of the values of this indefinite integral at $b$ and $a$ is our desired area.

We can simply apply all the rules of integration we learnt earlier.

Notation: We’ll let $[f(x)]_a^b$ be shorthand for $f(b) - f(a)$.
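The two methods just described can be compared numerically. The sketch below is only an illustration under stated assumptions: the function $f(x) = x^2$, the limits 1 and 2, and the helper name `riemann_sum` are our own choices and do not come from the text. It adds up many slim rectangles and then compares the total with the difference of an antiderivative at the two limits.

```python
def riemann_sum(f, a, b, n=100_000):
    """Approximate the area under f on [a, b] by adding up n slim rectangles.

    Each rectangle's height is f evaluated at the midpoint of its base.
    """
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


def f(x):
    # Illustrative integrand (an assumption, not taken from the text).
    return x ** 2


def F(x):
    # One indefinite integral of f (constant of integration chosen to be 0).
    return x ** 3 / 3


a, b = 1.0, 2.0
area_by_rectangles = riemann_sum(f, a, b)
area_by_ftc = F(b) - F(a)  # this is [F(x)] evaluated from a to b

print(area_by_rectangles, area_by_ftc)
```

With a finite number of rectangles the first figure is only an approximation, which is why the two printed values agree closely rather than exactly.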


50.1 Area between a Curve and Lines Parallel to Axes

Example 462. Find the exact area bounded by the curve $y = x^2$ and the horizontal lines $y = 1$ and $y = 2$.

It’s always helpful to make a quick sketch (given below). Our desired area is labelled $A$ below. To find a desired area, there are usually multiple methods, some quicker than others.

Method #1. The entire rectangle $A + B + C + D$ has area $2 \times 2\sqrt{2} = 4\sqrt{2}$. $B$ has area

$$\int_{-\sqrt{2}}^{-1} x^2\,dx = \left[\frac{x^3}{3}\right]_{-\sqrt{2}}^{-1} = -\frac{1}{3} - \left(-\frac{2\sqrt{2}}{3}\right) = \frac{2\sqrt{2} - 1}{3}.$$

By symmetry, $D$ has the same area as $B$. $C$ has area $1 \times 2 = 2$. Hence, $A$ has area

$$A + B + C + D - (B + C + D) = 4\sqrt{2} - \left(\frac{2\sqrt{2} - 1}{3} + 2 + \frac{2\sqrt{2} - 1}{3}\right) = \frac{4}{3}\left(2\sqrt{2} - 1\right).$$

Method #2. The right branch of the parabola $y = x^2$ has equation $x = \sqrt{y}$. The right half of the area $A$ is

$$\int_{y=1}^{y=2} x\,dy = \int_{y=1}^{y=2} \sqrt{y}\,dy = \frac{2}{3}\left[y^{3/2}\right]_1^2 = \frac{2}{3}\left(2\sqrt{2} - 1\right).$$

Hence, $A = \frac{4}{3}\left(2\sqrt{2} - 1\right)$.

[Figure: the parabola $y = x^2$ and the lines $y = 1$ and $y = 2$, for $x$ from $-2$ to 2, with the regions $A$, $B$, $C$, and $D$ labelled.]
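As a rough numerical check of Example 462, the sketch below follows Method #2: it integrates $x = \sqrt{y}$ from $y = 1$ to $y = 2$ and doubles the result by symmetry. The helper `midpoint_integral` is a hypothetical name of our own, and the midpoint rule only approximates the exact answer.

```python
import math


def midpoint_integral(f, a, b, n=100_000):
    """Approximate the definite integral of f from a to b (midpoint rule)."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


# Right half of the region: integrate x = sqrt(y) with respect to y.
right_half = midpoint_integral(math.sqrt, 1.0, 2.0)
numeric_area = 2 * right_half

# Exact value quoted in the example: (4/3)(2*sqrt(2) - 1).
exact_area = 4 / 3 * (2 * math.sqrt(2) - 1)

print(numeric_area, exact_area)
```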

Exercise 190. Find the exact area bounded by the curve $y = x^3$, the horizontal lines $y = 1$ and $y = 2$, and the vertical axis. (Answer on p. 1070.)

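50.2 Area between a Curve and a Line

Example 463. Find the area $A$ bounded by the curve $y = x^2$ and the line $y = x + 1$.

By the quadratic formula, the curve and line intersect at the points $x = \frac{1 \pm \sqrt{5}}{2}$. So

$$A = \int_{(1-\sqrt{5})/2}^{(1+\sqrt{5})/2} \left(x + 1 - x^2\right) dx = \left[\frac{x^2}{2} + x - \frac{x^3}{3}\right]_{(1-\sqrt{5})/2}^{(1+\sqrt{5})/2}$$

$$= \left[\frac{(1+\sqrt{5})^2}{2^3} + \frac{1+\sqrt{5}}{2} - \frac{(1+\sqrt{5})^3}{3 \cdot 2^3}\right] - \left[\frac{(1-\sqrt{5})^2}{2^3} + \frac{1-\sqrt{5}}{2} - \frac{(1-\sqrt{5})^3}{3 \cdot 2^3}\right]$$

$$= \left[\frac{6+2\sqrt{5}}{8} + \frac{1+\sqrt{5}}{2} - \frac{16+8\sqrt{5}}{24}\right] - \left[\frac{6-2\sqrt{5}}{8} + \frac{1-\sqrt{5}}{2} - \frac{16-8\sqrt{5}}{24}\right]$$

$$= \left[\frac{3+\sqrt{5}}{4} + \frac{1+\sqrt{5}}{2} - \frac{2+\sqrt{5}}{3}\right] - \left[\frac{3-\sqrt{5}}{4} + \frac{1-\sqrt{5}}{2} - \frac{2-\sqrt{5}}{3}\right] = \frac{7+5\sqrt{5}}{12} - \frac{7-5\sqrt{5}}{12} = \frac{5\sqrt{5}}{6}.$$

[Figure: the curve $y = x^2$ and the line $y = x + 1$, with the area $A$ between them labelled.]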
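The tedious arithmetic in Example 463 is easy to get wrong, so a quick numerical cross-check may be reassuring. The sketch below integrates (line minus curve) between the two intersection points with a midpoint rule; `midpoint_integral` is a hypothetical helper of our own and the result is an approximation, not a proof.

```python
import math


def midpoint_integral(f, a, b, n=100_000):
    """Approximate the definite integral of f from a to b (midpoint rule)."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


# Intersection points of y = x^2 and y = x + 1, from the quadratic formula.
lower = (1 - math.sqrt(5)) / 2
upper = (1 + math.sqrt(5)) / 2

# Area between the line (on top) and the curve (below).
numeric_area = midpoint_integral(lambda x: (x + 1) - x ** 2, lower, upper)
exact_area = 5 * math.sqrt(5) / 6

print(numeric_area, exact_area)
```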

Exercise 191. Find the exact area bounded by the curve $y = \sin x$ and the line $y = 0.5$, for $x \in (0, \pi/2)$. (Answer on p. 1071.)

50.3 Area between Two Curves

Example 464. Find the area $A$ bounded by the curves $y = x^2 - 2x - 1$ and $y = 1 - x^2$.

By the quadratic formula, the curves intersect at $x = \frac{1 \pm \sqrt{5}}{2}$. So

$$A = \int_{0.5(1-\sqrt{5})}^{0.5(1+\sqrt{5})} \left[1 - x^2 - \left(x^2 - 2x - 1\right)\right] dx = 2\int_{0.5(1-\sqrt{5})}^{0.5(1+\sqrt{5})} \left(1 - x^2 + x\right) dx = 2\left[x - \frac{x^3}{3} + \frac{x^2}{2}\right]_{0.5(1-\sqrt{5})}^{0.5(1+\sqrt{5})} = \frac{5\sqrt{5}}{3},$$

where we’ve simply recycled our tedious calculations from the previous example.

[Figure: the curves $y = x^2 - 2x - 1$ and $y = 1 - x^2$, with the area $A$ between them labelled.]

Exercise 192. Find the exact area bounded by the curves $y = 2 - x^2$ and $y = x^2 + 1$. (Answer on p. 1071.)

50.4 Area below the x-Axis

Example 465. Find the area $A$ bounded by the curve $y = x^2 - 4$ and the x-axis.

The definite integral calculates the signed area between the curve and the x-axis. So if the curve is under the x-axis, the computed area will be negative, as we now see:

$$\int_{-2}^{2} \left(x^2 - 4\right) dx = \left[\frac{x^3}{3} - 4x\right]_{-2}^{2} = \left(\frac{8}{3} - 8\right) - \left(\frac{-8}{3} + 8\right) = \frac{-32}{3}.$$

But of course, an area is simply a magnitude, so we’ll take the absolute value and conclude that the desired area is $\frac{32}{3}$.

[Figure: the curve $y = x^2 - 4$, with the area $A$ between the curve and the x-axis labelled.]
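To see the sign issue in Example 465 concretely, here is a small numerical sketch. It approximates the definite integral of $x^2 - 4$ from $-2$ to 2 and then takes the absolute value. The helper name `midpoint_integral` is our own, and the computed numbers are approximations.

```python
def midpoint_integral(f, a, b, n=100_000):
    """Approximate the definite integral of f from a to b (midpoint rule)."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


# The curve lies below the x-axis on (-2, 2), so the integral is negative.
signed_area = midpoint_integral(lambda x: x ** 2 - 4, -2.0, 2.0)

# An area is a magnitude, so we take the absolute value.
area = abs(signed_area)

print(signed_area, area)
```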

Exercise 193. Find the exact area bounded by the curve $y = x^4 - 16$ and the x-axis. (Answer on p. 1072.)

50.5 Area under a Parametrically-Defined Curve

Example 466. Consider the curve described by the equations $x = t^3 - 2$ and $y = 4 - t^5$. Find the exact area bounded by the curve, the lines $x = -2$ and $x = -1$, and the horizontal axis.

It helps to graph this curve on your graphing calculator.

[Figure: graphing-calculator screenshot of the curve.]

Note that $x = -1 \iff t = 1$, $x = -2 \iff t = 0$, and $\frac{dx}{dt} = 3t^2$. So the area can be computed as:

$$\int_{x=-2}^{x=-1} y\,dx = \int_{x=-2}^{x=-1} \left(4 - t^5\right) dx = \int_{t=0}^{t=1} \left(4 - t^5\right) 3t^2\,dt = \left[4t^3 - \frac{3t^8}{8}\right]_0^1 = 4 - \frac{3}{8} = \frac{29}{8}.$$
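The substitution used in Example 466 can be sketched numerically: integrate $y(t)\,\frac{dx}{dt}$ over the values of $t$ that correspond to the given limits on $x$. The function names below are our own, and the midpoint rule gives only an approximation. Evaluating the antiderivative $4t^3 - \frac{3t^8}{8}$ at $t = 1$ and $t = 0$ gives $4 - \frac{3}{8} = \frac{29}{8}$, which is the value the check compares against.

```python
def midpoint_integral(f, a, b, n=100_000):
    """Approximate the definite integral of f from a to b (midpoint rule)."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


def x(t):
    return t ** 3 - 2


def y(t):
    return 4 - t ** 5


def dx_dt(t):
    return 3 * t ** 2


# x = -2 corresponds to t = 0, and x = -1 corresponds to t = 1.
t_lower, t_upper = 0.0, 1.0

# Area = integral of y dx = integral of y(t) * (dx/dt) dt.
area = midpoint_integral(lambda t: y(t) * dx_dt(t), t_lower, t_upper)

print(area)
```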

Exercise 194. Consider the curve described by the equations $x = t^2 + 2t$ and $y = t^3 - 1$. Find the exact area bounded by the curve, the lines $y = 1$ and $y = 2$, and the vertical axis. (Answer on p. 1072.)

50.6 Volume of Rotation about the y- or x-Axis

Example 467. Consider the line $y = 1$. Rotate it about the x-axis to form an (infinite) 3D cylinder. Now consider the finite portion of the cylinder between $x = 1$ and $x = 2$. By a primary school formula, its volume is Base Area × Height $= \pi \cdot 1^2 \times (2 - 1) = \pi$.

[Figure: the cylinder, with its radius, height, and volume labelled.]

We can also compute this same volume using integration. The intuition is that we’re adding up infinitely-many infinitely-thin circle-shaped slices, laid on their sides, from $x = 1$ to $x = 2$ (left to right). The face of each of these circles has area $\pi y^2$. In this particular example, $y$ is constant (simply 1). Thus, the total volume is

$$\int_1^2 \pi y^2\,dx = \int_1^2 \pi\,dx = \left[\pi x\right]_1^2 = \pi.$$

Example 468. Rotate the line $y = 3x$ about the x-axis to form an infinite double cone. Consider the finite portion of the cone between $x = 0$ and $x = 2$. By the formula for the volume of a cone, we know its volume is $\frac{1}{3}\pi r^2 h = \frac{1}{3}\pi \cdot 6^2 \times 2 = 24\pi$.

We can also compute this same volume using integration. Again, the intuition is that we’re adding up infinitely-many infinitely-thin circle-shaped slices, from $x = 0$ to $x = 2$. Again, the face of each of these circles has area $\pi y^2$. In this particular example, $y = 3x$. Thus, the total volume is

$$\int_0^2 \pi y^2\,dx = \int_0^2 \pi (3x)^2\,dx = 9\pi\left[\frac{x^3}{3}\right]_0^2 = 24\pi.$$

[Figure: the cone, with its radius, height, and volume labelled.]

Now consider instead the finite portion of the cone between $x = 3$ and $x = 5$. This looks like a pedestal tilted sideways (not illustrated). We can easily compute its volume using integration:

$$\int_3^5 \pi y^2\,dx = \int_3^5 \pi (3x)^2\,dx = 9\pi\left[\frac{x^3}{3}\right]_3^5 = 294\pi.$$

Computing its volume using geometric formulae is possible, if slightly more tedious. The finite portion of the cone between $x = 0$ and $x = 3$ has volume $V_1 = \frac{1}{3}\pi r^2 h = \frac{1}{3}\pi \cdot 9^2 \times 3 = 81\pi$. The finite portion of the cone between $x = 0$ and $x = 5$ has volume $V_2 = \frac{1}{3}\pi r^2 h = \frac{1}{3}\pi \cdot 15^2 \times 5 = 375\pi$. Hence, the desired volume is $V = V_2 - V_1 = 375\pi - 81\pi = 294\pi$.
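The “adding up thin circular slices” idea in Example 468 can be sketched as a sum of many thin discs, each with face area $\pi y^2$ and a small thickness. The names `midpoint_integral` and `volume_about_x_axis` are hypothetical helpers of our own, and the sums only approximate the exact volumes $24\pi$ and $294\pi$.

```python
import math


def midpoint_integral(f, a, b, n=100_000):
    """Approximate the definite integral of f from a to b (midpoint rule)."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


def volume_about_x_axis(y, a, b):
    """Volume obtained by rotating the graph of y about the x-axis, from a to b."""
    # Each slice is a disc of radius y(x), so its face has area pi * y(x)^2.
    return midpoint_integral(lambda x: math.pi * y(x) ** 2, a, b)


def line(x):
    return 3 * x


cone_volume = volume_about_x_axis(line, 0.0, 2.0)      # expected: about 24*pi
pedestal_volume = volume_about_x_axis(line, 3.0, 5.0)  # expected: about 294*pi

print(cone_volume / math.pi, pedestal_volume / math.pi)
```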

We can just as easily find the volume of rotation about the y-axis.

Example 469. Consider the curve $y = x^2$. Find its volume of rotation about the y-axis, from $y = 0$ to $y = 5$.

In this case, there are no familiar geometric formulae we can apply. So we really just have to compute this volume using integration. Again, the intuition is that we’re adding up infinitely-many infinitely-thin circle-shaped slices, but this time these circle-shaped slices are stacked from bottom to top, from $y = 0$ to $y = 5$. The face of each of these circles has area $\pi x^2$, where in this particular example, $x^2 = y$. Thus, the total volume is

$$\int_0^5 \pi x^2\,dy = \int_0^5 \pi y\,dy = \pi\left[\frac{y^2}{2}\right]_0^5 = 12.5\pi.$$

[Figure: the solid of rotation, with its volume labelled.]

Exercise 195. Compute the volume of rotation of $y = \sin x$ about the x-axis from $x = 0$ to $x = \pi$. (Answer on p. 1072.)

50.7 Finding Definite Integrals on your TI84

As usual, just one quick and very simple example.

Example 470. Use your TI84 to find the approximate area bounded by the curve $y = e^{\sin x}$ and the horizontal axis, between $x = 1$ and $x = 2$.

[Figures: calculator screenshots after each of Steps 1 to 10.]

1. Press ON to turn on your calculator.
2. Press Y= .
3. Press the blue 2ND button and then $e^x$ (which corresponds to the LN button). Then press SIN X,T,θ,n ) ) and altogether you will have entered $e^{\sin x}$.
4. Now press GRAPH and the calculator will graph the given equation.
5. Press the blue 2ND button and then CALC (which corresponds to the TRACE button), to bring up the CALCULATE menu.
6. Press 7 to select the “$\int f(x)\,dx$” option. This brings you back to the graph.
7. The TI84 is now prompting you for “Lower Limit?” Simply press 1 .
8. Now press ENTER and you will have told the TI84 that your lower limit is $x = 1$.
9. The TI84 is now similarly prompting you for “Upper Limit?” Simply press 2 .
10. Now press ENTER and you will have told the TI84 that your upper limit is $x = 2$. The TI84 also informs you that “$\int f(x)\,dx = 2.60466115$”. This is our desired area (which is now also kindly shaded in black by our TI84).

51 Differential Equations

51.1 $\frac{dy}{dx} = f(x)$

$\frac{dy}{dx} = f(x)$ is simply equivalent to $y = \int f\,dx$.

Example 471. Solve $\frac{dy}{dx} = x^2$.

Easy: $y = \int x^2\,dx = \frac{x^3}{3} + C$, where as usual $C$ is the constant of integration.

This is the general solution to the given differential equation. It is general because $C$ is free to vary and so there are many possible solutions for $y$.

However, suppose we are given also an additional piece of information: $x = 0 \implies y = 1$. Such information is often called an initial condition. Here’s why. It might be that $y$ is the number of bats in a cave and $x$ is time. Then the initial condition tells us that at time $x = 0$ (i.e. “initially”), there is $y = 1$ bat in the cave. Over time, the number of bats in the cave grows according to the differential equation $\frac{dy}{dx} = x^2$.

With the initial condition $x = 0 \implies y = 1$, we have $1 = \frac{0^3}{3} + C$. We thus find that $C = 1$. We thus have that $y = \frac{x^3}{3} + 1$. This is the particular solution to the given differential equation (with given initial condition).

Example 472. Solve $\frac{dy}{dx} = \sin x$.

$y = \int \sin x\,dx = -\cos x + C$, where as usual $C$ is the constant of integration. Again, this is the general solution to the given differential equation.

If we are given the initial condition that $x = 0 \implies y = 1$, then we can write $1 = -\cos 0 + C$ and find that $C = 2$. We thus have that $y = -\cos x + 2$. This is the particular solution to the given differential equation (with given initial condition).

Exercise 196. Find the general solution of $\frac{dy}{dx} = e^x \sin x$. Find also the particular solution, if given also the initial condition $x = 0 \implies y = 1$. (Answer on p. 1073.)

51.2 $\frac{dy}{dx} = f(y)$

By the Inverse Function Theorem,

$$\frac{dx}{dy} = \frac{1}{\;\frac{dy}{dx}\;} \quad \left(\text{for } \frac{dy}{dx} \neq 0\right).$$

So given $\frac{dy}{dx} = f(y)$, rearrange to get $\frac{dx}{dy} = \frac{1}{f(y)}$ (for $f(y) \neq 0$). Equivalently, $x = \int \frac{1}{f(y)}\,dy$.

Example 473. Solve $\frac{dy}{dx} = y^2$.

Rearrange to get $\frac{dx}{dy} = \frac{1}{y^2}$ (for $y^2 \neq 0$, i.e. $y \neq 0$). Hence, $x = \int \frac{1}{y^2}\,dy = \frac{-1}{y} + C$ (for $y \neq 0$). This is the general solution to the given differential equation.

We will often be asked to express $y$ in terms of $x$. If so, we can easily rearrange to get $y = \frac{1}{C - x}$ (for $x \neq C$). This is also the general solution to the given differential equation!

If given also the initial condition $x = 0 \implies y = 1$, then we have

$$1 = \frac{1}{C - 0} \implies C = 1.$$

Thus, $y = \frac{1}{1 - x}$ is the particular solution to the given differential equation (with given initial condition).
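A solution to a differential equation can always be checked by substituting it back in. The sketch below does this numerically for Example 473: it estimates $\frac{dy}{dx}$ for $y = \frac{1}{1 - x}$ with a central difference and compares it with $y^2$. The helper `derivative`, the step size, and the sample points are our own choices, and a finite-difference estimate is only approximate.

```python
def y(x):
    # Particular solution from Example 473 (valid for x != 1).
    return 1 / (1 - x)


def derivative(f, x, h=1e-6):
    """Estimate f'(x) with a central difference."""
    return (f(x + h) - f(x - h)) / (2 * h)


sample_points = (-2.0, -1.0, 0.0, 0.5)

# Gap between the estimated dy/dx and y^2 at each sample point.
residuals = [abs(derivative(y, x) - y(x) ** 2) for x in sample_points]

print(y(0.0))     # the initial condition: x = 0 gives y = 1
print(residuals)  # each entry should be close to 0
```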

Example 474. Solve $\frac{dy}{dx} = \sin y$.

Rearrange to get $\frac{dx}{dy} = \frac{1}{\sin y} = \csc y$ (for $y$ not an integer multiple of $\pi$). Hence, by Proposition 11, $x = \int \csc y\,dy = -\ln\left|\csc y + \cot y\right| + C$ (for $y$ not an integer multiple of $\pi$). This is the general solution for the given differential equation.

Unfortunately, without more information, it is impossible to write $y$ as a function of $x$, because for each given value of $x$, there are multiple possible values of $y$, as we now show, by manipulating that last equation:

$$x = -\ln(\csc y + \cot y) + C$$

$$\iff e^{C-x} = \csc y + \cot y = \frac{1 + \cos y}{\sin y} = \frac{2\cos^2(y/2)}{2\sin(y/2)\cos(y/2)} = \frac{\cos(y/2)}{\sin(y/2)} = \cot\frac{y}{2}$$

$$\iff y = 2\left(\cot^{-1} e^{C-x} + m\pi\right), \text{ for any } m \in \mathbb{Z},$$

where the last step uses the fact that $\cot$ has period $\pi$.

That is, for each given value of $x$, there are infinitely-many possible values of $y$ (one for each integer $m$). But now suppose we have the initial condition $x = 3 \implies y = \frac{\pi}{2}$. In this case, we have

$$3 = -\ln\left|\csc\frac{\pi}{2} + \cot\frac{\pi}{2}\right| + C = -\ln|1| + C = C,$$

so that $C = 3$. We may write $y = 2\left(\cot^{-1} e^{3-x} + m\pi\right)$. Moreover, plugging in the same values for $x$ and $y$, we see that

$$\frac{\pi}{2} = 2\left(\cot^{-1} e^{3-3} + m\pi\right) = \frac{\pi}{2} + 2m\pi.$$

Hence, $m = 0$ and $y = 2\cot^{-1} e^{3-x}$. This is the particular solution to the given differential equation (with given initial condition).

Exercise 197. Find the general solution of $\frac{dy}{dx} = y^2 + 1$. Find also the particular solution, given also the initial condition $x = 0 \implies y = 1$. (Answer on p. 1073.)

51.3 $\frac{d^2y}{dx^2} = f(x)$

$\frac{d^2y}{dx^2} = f(x)$ is equivalent to $\frac{dy}{dx} = \int f\,dx$, which in turn is equivalent to $y = \int\left(\int f\,dx\right) dx$.

Example 475. Solve $\frac{d^2y}{dx^2} = x^2$.

$\frac{dy}{dx} = \int x^2\,dx = \frac{x^3}{3} + C_1$. Next, $y = \int \left(\frac{x^3}{3} + C_1\right) dx = \frac{x^4}{12} + C_1 x + C_2$. This is the general solution to the given differential equation.

If given the initial conditions $x = 0 \implies y = 1$ and $x = 1 \implies y = 2$, then we have

$$1 = \frac{0^4}{12} + 0 \cdot C_1 + C_2 \implies C_2 = 1,$$

$$2 = \frac{1^4}{12} + 1 \cdot C_1 + 1 \implies C_1 = \frac{11}{12}.$$

Hence $y = \frac{x^4}{12} + \frac{11}{12}x + 1$ is the particular solution.

Example 476. Solve $\frac{d^2y}{dx^2} = \sin x$.

$\frac{dy}{dx} = \int \sin x\,dx = -\cos x + C_1$. Next, $y = \int \left(-\cos x + C_1\right) dx = -\sin x + C_1 x + C_2$. This is the general solution to the given differential equation.

If given the additional pieces of information that $x = 0 \implies y = 1$ and $x = \pi \implies y = 2$, then we have

$$1 = -\sin 0 + 0 \cdot C_1 + C_2 \implies C_2 = 1,$$

$$2 = -\sin \pi + \pi C_1 + 1 \implies C_1 = \frac{1}{\pi}.$$

Hence $y = -\sin x + \frac{1}{\pi}x + 1$ is the particular solution.

Exercise 198. Find the general solution of $\frac{d^2y}{dx^2} = e^x \sin x$. Find also the particular solution, given also that $x = 0 \implies y = 1$. (Answer on p. 1074.)

51.4 Word Problems

The H2 Maths syllabuses require that you know how to

• Formulate a differential equation from a problem situation; and

• Interpret a differential equation and its solution in terms of a problem situation.

So that’s what we’ll do in this section.

Example 477. A plate of bacteria grows at a rate that is inversely proportional to the number of bacteria. Express the number of bacteria as a function of time.

Let $x$ be the number of bacteria. Let $t$ be time. We are given that the rate of growth of $x$ is inversely proportional to $x$. In other words, $\frac{dx}{dt} = \frac{k}{x}$, for some constant $k \in \mathbb{R}$. Rearranging, we have $\frac{dt}{dx} = \frac{x}{k}$. Thus,

$$t = \int \frac{x}{k}\,dx = \frac{x^2}{2k} + C.$$

Further rearranging, we have $x = \pm\sqrt{2k(t - C)}$, where of course the negative root may be rejected. Hence, $x = \sqrt{2k(t - C)}$.

Suppose we are also given that $t = 0 \implies x = 1$ and $t = 1 \implies x = 2$. Then we have

$$1 \overset{a}{=} \sqrt{2k(-C)} \quad \text{and} \quad 2 \overset{b}{=} \sqrt{2k(1 - C)}.$$

From $\overset{a}{=}$, we have $C = -1/(2k)$. Plug this into $\overset{b}{=}$ and we have $4 = 2k\left(1 + 1/(2k)\right) = 2k + 1$ or $k = 3/2$. Hence $C = -1/3$. Altogether then, the particular solution is $x = \sqrt{3t + 1}$.
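The particular solution $x = \sqrt{3t + 1}$ of Example 477 can be checked against both the given data and the differential equation. Differentiating gives $\frac{dx}{dt} = \frac{3}{2\sqrt{3t + 1}} = \frac{3/2}{x}$, so the sketch below uses $k = 3/2$ as the constant of proportionality. The helper `derivative` and the sample times are our own choices, and the finite-difference estimate is only approximate.

```python
import math


def x(t):
    # Particular solution from Example 477.
    return math.sqrt(3 * t + 1)


def derivative(f, t, h=1e-6):
    """Estimate f'(t) with a central difference."""
    return (f(t + h) - f(t - h)) / (2 * h)


k = 3 / 2  # constant of proportionality in dx/dt = k / x

sample_times = (0.0, 0.5, 1.0, 4.0)

# Gap between the estimated dx/dt and k / x at each sample time.
residuals = [abs(derivative(x, t) - k / x(t)) for t in sample_times]

print(x(0.0), x(1.0))  # the two given data points: 1 and 2
print(residuals)       # each entry should be close to 0
```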

Exercise 199. Follow these steps to find the escape velocity (of an object from Earth). (Answer on p. 1075.)

(a) The law of gravitation states that the force of attraction $F$ between two point masses $M$ and $m$ is proportional to the product of their masses and inversely proportional to the square of the distance $r$ between them. Write down this law in the form of an equation. Your answer should contain a constant; name this constant $G$ (this is the gravitational constant).

Momentum is defined as the product of mass $m$ and velocity $v$. Newton’s Second Law of Motion states that force is the rate of change of momentum.

(b) (i) Write down Newton’s Second Law in the form of an equation. (ii) Assume that mass $m$ is constant. Explain why $F = m\frac{dv}{dt}$.

Now suppose $M$ and $m$ are, respectively, the masses of the Earth and a small ball. Assume that

• The Earth is a perfect sphere with radius $R$ m.

• You can treat the Earth as a single point with its mass concentrated at the centre of the sphere. Thus, the initial distance between the Earth’s centre of mass and the ball is $R + x$ m.

• Upwards (away from the Earth) is the positive direction and downwards (towards the centre of the Earth) is the negative direction.

• The Earth is immobile.

• There is no air resistance or any other form of friction.

(c) The small ball is initially held at rest, $x$ m above the surface of the Earth. It is then released. Let $v$ be the velocity of the ball. Explain why $\frac{GM}{r^2} = -\frac{dv}{dt}$. (In particular, explain why there is a negative sign.)

From the equation in (c), we may write:

$$\int_{R+x}^{R} \frac{GM}{r^2}\,dr = \int_{R+x}^{R} -\frac{dv}{dt}\,dr.$$

Let $v_s$ be the velocity at which the ball hits the surface of the Earth.

(d) (i) Show that the LHS of the above equation is equal to $GM\left(-\frac{1}{R} + \frac{1}{R+x}\right)$.

(ii) Show that the RHS of the above equation is equal to $-\frac{v_s^2}{2}$. (Hint 1: Use integration by substitution. Hint 2: What is $\frac{dr}{dt}$?)

(iii) Hence show that $v_s = -\sqrt{2GM\left(\frac{1}{R} - \frac{1}{R+x}\right)}$. Again, explain why $v_s$ is negative.

Suppose instead that the small ball is initially at rest on the surface of the Earth. It is then propelled upwards at a velocity $V$.

(e) Explain why the ball will reach a maximum height of $x$ m, where $V = \sqrt{2GM\left(\frac{1}{R} - \frac{1}{R+x}\right)}$, before falling back down to the Earth.

(f) The escape velocity $v_e$ is the minimum velocity with which we must propel the ball upwards (from its initial resting position on the surface of the Earth) so that it never falls back down. Explain why $v_e = \sqrt{\frac{2GM}{R}}$.

(g) Given that $G = 6.674 \times 10^{-11}$ m$^3$ kg$^{-1}$ s$^{-2}$, $M = 5.972 \times 10^{24}$ kg, and $R = 6{,}371$ km, compute $\sqrt{\frac{2GM}{R}}$ (express your answer in km s$^{-1}$, correct to 4 significant figures).

51.5 Family of Solution Curves to Represent the General Solution

SYLLABUS ALERT

This is in the 9740 (old) syllabus, but not in the 9758 (revised) syllabus. So you can skip this section if you’re taking 9758.

Example 478. The general solution to $\frac{dy}{dx} = x^2$ is $y = \int x^2\,dx = \frac{x^3}{3} + C$.

The corresponding family of solution curves is the set of equations $\left\{y = \frac{x^3}{3} + C : C \in \mathbb{R}\right\}$. This family is illustrated below.

[Figure: several members of the family of solution curves $y = \frac{x^3}{3} + C$, with the curve for $C = 1$ highlighted in red.]

Given also the initial condition $x = 0 \implies y = 1$, we find that $C = 1$. This particular solution $y = \frac{x^3}{3} + 1$ is highlighted in red above.

Exercise 200. Sketch five members of the family of solution curves for $\frac{d^2y}{dx^2} = x$, given also that $x = 0 \implies y = 1$. (Answer on p. 1077.)

Part VI

Probability and Statistics


52 How to Count: Four Principles

How many arrangements or permutations are there of the three letters in CAT? For example, one possible permutation of CAT is TCA. To solve this problem, one possible method is the method of enumeration. That is, simply list out (enumerate) all the possible permutations.

ACT, ATC, CAT, CTA, TAC, TCA.

We see that there are 6 possible permutations. Enumeration works well enough when we have just three letters, as in CAT. Indeed, enumeration is sometimes the quickest method. In contrast, the 13 letters in the word UNPREDICTABLY have 6,227,020,800 possible permutations. So enumeration is probably not practical.

To help us count more efficiently, we’ll learn about four basic principles of counting:

1. The Addition Principle (AP);
2. The Multiplication Principle (MP);
3. The Inclusion-Exclusion Principle (IEP); and
4. The Complements Principle (CP).
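The method of enumeration described for CAT is exactly the sort of chore a computer is good at. The sketch below uses Python’s standard `itertools.permutations` to list every arrangement of CAT, and `math.factorial` to get the count for UNPREDICTABLY without listing anything. The variable names are our own.

```python
from itertools import permutations
from math import factorial

# Enumerate every arrangement of the three letters, in alphabetical order.
cat_permutations = sorted("".join(p) for p in permutations("CAT"))
print(cat_permutations)  # ['ACT', 'ATC', 'CAT', 'CTA', 'TAC', 'TCA']
print(len(cat_permutations))

# Listing all arrangements of 13 distinct letters is impractical,
# but counting them is not: the number of arrangements is 13!.
unpredictably_count = factorial(len("UNPREDICTABLY"))
print(unpredictably_count)
```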

52.1 How to Count: The Addition Principle

The addition principle (AP) is very simple.

Example 479. For lunch today, I can either go to the food court or the hawker centre. At the food court, I have 2 choices: ramen or briyani. At the hawker centre, I have 3 choices: bak chor mee, nasi lemak, or kway teow. Altogether then, I have 2 + 3 = 5 choices of what to eat for lunch today.

Here’s an informal statement of the AP:59

The Addition Principle (AP). I have to choose a destination, out of two possible areas. At area #1, there are $p$ possible destinations to choose from. At area #2, there are $q$ possible destinations to choose from. The Addition Principle (AP) simply states that I have, in total, $p + q$ different choices.

(Just so you know, the AP is sometimes also called the Second Principle of Counting or the Rule of Sum or the Disjunctive Rule.)

Of course, the AP generalises to cases where there are more than just 2 “areas”. It may seem a little silly, but just to illustrate, let’s use the AP to tackle the CAT problem:

59 See section 85.1 in the Appendices (optional) for a more precise statement of the AP.

Example 480. Problem: How many permutations are there of the letters in the word CAT?

We can divide the possibilities into three cases:

Case #1. First letter is an A. Then the next two letters are either CT or TC (2 possibilities).

Case #2. First letter is a C. Then the next two letters are either AT or TA (2 possibilities).

Case #3. First letter is a T. Then the next two letters are either AC or CA (2 possibilities).

Altogether then, by the AP, there are 2 + 2 + 2 = 6 possibilities. That is, there are 6 possible permutations of the letters in CAT. These are illustrated in the tree diagram below.

[Figure: tree diagram of the 6 permutations of CAT.]

The next exercise is very simple and just illustrates the AP again.

Exercise 201. Without retracing your steps, how many ways are there to get from the Starting Point to the River (see figure below)? (Answer on p. 1078.)

[Figure: paths from the Starting Point to the River.]

Exercise 202. How many permutations are there of the letters in the word DEED? Illustrate your answer with a tree diagram similar to that given in the CAT example above. (Answer on p. 1078.)

52.2 How to Count: The Multiplication Principle

Example 481. For lunch today, I can either have prata or horfun. For dinner tonight, I can have McDonald’s, KFC, or Pizza Hut. Enumeration shows that I have a total of 6 possible choices for my two meals today: (Prata, McDonald’s), (Prata, KFC), (Prata, Pizza Hut), (Horfun, McDonald’s), (Horfun, KFC), (Horfun, Pizza Hut).

Alternatively, we can use the Multiplication Principle (MP). I have 2 choices for lunch and 3 choices for dinner. Hence, for my two meals today, I have in total 2 × 3 = 6 possible choices.

Here’s an informal statement of the MP:60

The Multiplication Principle (MP). I have to choose two destinations, one from each of two possible areas. At area #1, there are $p$ possible destinations to choose from. At area #2, there are $q$ possible destinations to choose from. The Multiplication Principle (MP) simply states that I have, in total, $p \times q$ different choices.

(The MP is sometimes also called the Fundamental or First Principle of Counting or the Rule of Product or the Sequential Rule.)

Of course, the MP generalises to cases where there are more than just 2 “areas”. Here’s an example where we have to make 3 decisions:

60 See section 85.1 in the Appendices (optional) for a more precise statement of the MP.

Example 482. For breakfast tomorrow, I can have shark’s fin or bird’s nest (2 choices). For lunch, I can have black pepper crab or curry fishhead (2 choices). For dinner, I can have an apple, a banana, or a carrot (3 choices). By the MP, for tomorrow’s meals, I have a total of 2 × 2 × 3 = 12 possible choices. We can enumerate these (I’ll use abbreviations):

(SF, BPC, A), (SF, BPC, B), (SF, BPC, C),
(SF, CF, A), (SF, CF, B), (SF, CF, C),
(BN, BPC, A), (BN, BPC, B), (BN, BPC, C),
(BN, CF, A), (BN, CF, B), (BN, CF, C).

Example 483. Problem: How many four-letter words can be formed using the letters in the 26-letter alphabet?

Let’s rephrase this problem so that it is clearly in the framework of the MP. We have 4 blank spaces to be filled:

_ _ _ _
1 2 3 4

These 4 blank spaces correspond to 4 decisions to be made.

Decision #1: What letter to put in the first blank space?
Decision #2: What letter to put in the second blank space?
Decision #3: What letter to put in the third blank space?
Decision #4: What letter to put in the fourth blank space?

How many choices have we for each decision? For Decision #1, we can put A, B, C, ..., or Z. So we have 26 choices for Decision #1. For Decision #2, we can again put A, B, C, ..., or Z. So we again have 26 choices for Decision #2. We likewise have 26 choices for Decision #3 and also 26 choices for Decision #4.

Altogether then, by the MP, there are $26 \times 26 \times 26 \times 26 = 26^4 = 456{,}976$ ways to make our four decisions.

Solution: There are $26^4 = 456{,}976$ possible four-letter words that can be formed using the 26-letter alphabet.
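The MP corresponds to what programmers call a Cartesian product: one choice from each list, in every possible combination. As a small illustration, the sketch below uses Python’s standard `itertools.product` to enumerate the meals of Example 482 (with the same abbreviations) and then counts the four-letter words of Example 483. The variable names are our own.

```python
from itertools import product

breakfasts = ["SF", "BN"]    # shark's fin, bird's nest
lunches = ["BPC", "CF"]      # black pepper crab, curry fishhead
dinners = ["A", "B", "C"]    # apple, banana, carrot

# One choice per meal, in every possible combination.
meal_plans = list(product(breakfasts, lunches, dinners))
for plan in meal_plans:
    print(plan)

# The MP predicts 2 x 2 x 3 plans.
predicted_count = len(breakfasts) * len(lunches) * len(dinners)

# Four blank spaces, each with 26 possible letters.
four_letter_words = 26 ** 4
print(predicted_count, four_letter_words)
```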

Example 484. One 18-sided die has the numbers 1 through 18 printed on its sides (one number per side). Another six-sided die has the letters A, B, C, D, E, and F printed on its sides (one letter per side). We roll the two dice. How many distinct possible outcomes are there?

Again, let’s rephrase this problem in the framework of the MP. Consider 2 blank spaces:

_ _
1 2

These 2 blank spaces correspond to 2 decisions to be made.

Decision #1: What number to put in the first blank space?
Decision #2: What letter to put in the second blank space?

Again we ask: How many choices have we for each decision? For Decision #1, we can put 1, 2, 3, ..., or 18. So we have 18 choices for Decision #1. For Decision #2, we can put A, B, C, D, E, or F. So we have 6 choices for Decision #2.

Altogether then, by the MP, there are 18 × 6 = 108 ways to make our two decisions. In other words, there are 108 possible outcomes from rolling these two dice.

(If necessary, it is tedious but not difficult to enumerate them: 1A, 1B, 1C, 1D, 1E, 1F, 2A, 2B, ..., 17E, 17F, 18A, 18B, 18C, 18D, 18E, and 18F.)

Exercise 203. A club has a shortlist of 3 men for president, 5 animals for vice-president, and 10 women for club mascot. How many possible ways are there to choose the president, the vice-president, and the mascot? (Answer on p. 1079.)

Exercise 204. (Answer on p. 1079.) The highly-stimulating game of 4D consists of selecting a four-digit number, between 0000 and 9999 (so there are 10,000 possible numbers). Your mother tells you to go to the nearest gambling den (also known as a Singapore Pools outlet) to buy any three numbers, subject to these two conditions:

• The four digits in each number are distinct.

• Each four-digit number is distinct.

How many possible ways are there to fulfil your mother’s request?

52.3 How to Count: The Inclusion-Exclusion Principle

The Inclusion-Exclusion Principle (IEP) is another very simple principle.

Example 485. For lunch today, I can either go to the food court or the hawker centre. At the food court, I have 4 choices of cuisine: Chinese, Indian, Malay, and Western. At the hawker centre, I have 3 choices of cuisine: Chinese, Malay, and Thai. There are 2 choices of cuisine that are common to both the food court and the hawker centre (Chinese and Malay). And so by the Inclusion-Exclusion Principle (IEP), I have in total 4 + 3 − 2 = 5 choices of cuisine. The Venn diagram below illustrates.

Why do we subtract 2? If we simply added the 4 choices available at the food court to the 3 available at the hawker centre, then we’d double-count the Chinese and Malay cuisines, which are available at both the food court and the hawker centre. And so we must subtract the 2 cuisines that are at both locations.

[Figure: Venn diagram of the cuisines at the food court and at the hawker centre.]

Example 486. Problem: How many integers between 1 and 20 are divisible by 2 or 5?

There are 10 integers divisible by 2, namely 2, 4, 6, 8, 10, 12, 14, 16, 18, and 20. There are 4 integers divisible by 5, namely 5, 10, 15, and 20. There are 2 integers divisible by BOTH 2 and 5, namely 10 and 20. Hence, by the IEP, there are 10 + 4 − 2 = 12 integers that are divisible by either 2 or 5. (These are namely 2, 4, 5, 6, 8, 10, 12, 14, 15, 16, 18, and 20.)

Here’s an informal statement of the IEP:61

The Inclusion-Exclusion Principle (IEP). I have to choose a destination, out of two possible areas. At area #1, there are $p$ possible destinations to choose from. At area #2, there are $q$ possible destinations to choose from. Areas #1 and #2 overlap: they have $r$ destinations in common. The IEP simply states that I have, in total, $p + q - r$ different choices.

Exercise 205. (Answer on p. 1080.) The food court has 4 types of cuisine: Chinese, Indonesian, Korean, and Western. The hawker centre has 3: Chinese, Malay, and Western. A restaurant has 3: Chinese, Japanese, or Malay. In total, how many different types of cuisine are there? Illustrate your answer with a Venn diagram.

61 See section 85.1 in the Appendices (optional) for a more precise statement of the IEP.
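The IEP is a statement about the sizes of two overlapping sets, so Python’s built-in sets give a direct way to check Example 486. The sketch below counts the integers from 1 to 20 that are divisible by 2 or 5 in two ways: using the formula $p + q - r$, and by forming the union of the two sets. The variable names are our own.

```python
numbers = range(1, 21)  # the integers 1, 2, ..., 20

divisible_by_2 = {n for n in numbers if n % 2 == 0}
divisible_by_5 = {n for n in numbers if n % 5 == 0}

# IEP: add the two counts, then subtract the count of the overlap.
overlap = divisible_by_2 & divisible_by_5
count_by_iep = len(divisible_by_2) + len(divisible_by_5) - len(overlap)

# Direct count: a set union never double-counts.
union = divisible_by_2 | divisible_by_5
count_by_union = len(union)

print(sorted(overlap))
print(count_by_iep, count_by_union)
```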

52.4 How to Count: The Complements Principle

The Complements Principle (CP) is another very simple principle.

Example 487. The food court has 4 types of cuisine: Chinese, Malay, Indian, and Other. I’m at the food court but don’t feel like eating Malay or Chinese. So by the Complements Principle (CP), I have 4 − 2 = 2 possible choices of cuisine (Indian and Other).

Here’s an informal statement of the CP:62

The Complements Principle (CP). There are $p$ possible destinations. I must choose one. I rule out $q$ of the possible destinations. The Complements Principle says that I am left with $p - q$ possible choices.

Exercise 206. There are 10 Southeast Asian countries, of which 3 (Brunei, Indonesia, and the Philippines) are not on the mainland. How many mainland Southeast Asian countries are there that a European tourist can visit? (Answer on p. 1080.)

62 See section 85.1 in the Appendices (optional) for a more precise statement of the CP.

53 How to Count: Permutations

In this chapter, we’ll use the MP to generate several more methods of counting. But first, some notation you should find familiar from secondary school:

Definition 105. Let $n \in \mathbb{Z}_0^+$. Then n-factorial, denoted $n!$, is defined by $n! = n \times (n-1) \times \cdots \times 1$ for $n \geq 1$ and $0! = 1$.

Example 488. 0! = 1, 1! = 1, 2! = 2 × 1 = 2, 3! = 3 × 2 × 1 = 6, 4! = 4 × 3 × 2 × 1 = 24, 5! = 5 × 4 × 3 × 2 × 1 = 120.

Exercise 207. Compute 6!, 7!, and 8!. (Answer on p. 1081.)

We now revisit the CAT problem, using the MP:

Example 489. Problem: How many permutations (or arrangements) are there of the three letters in the word CAT?

Let’s rephrase this problem in the framework of the MP. Consider three blank spaces:

_ _ _
1 2 3

These 3 blank spaces correspond to 3 decisions to be made.

Decision #1: What letter to put in the first blank space?
Decision #2: What letter to put in the second blank space?
Decision #3: What letter to put in the third blank space?

Again we ask: How many choices have we for each decision? For Decision #1, we can put C, A, or T. So we have 3 choices for Decision #1. Having already used up a letter in Decision #1, we are left with two letters. So we have 2 choices for Decision #2. Having already used up a letter in Decision #1 and another in Decision #2, we are left with just one letter. So we have only 1 choice for Decision #3.

Altogether then, by the MP, there are 3 × 2 × 1 = 3! = 6 possible ways of making our decisions. This is also the number of ways there are to arrange the three letters in the word CAT.

Let’s now try the UNPREDICTABLY problem. Example 490. Problem: How many permutations are there of the 13 letters in the word UNPREDICTABLY? Again, let’s rephrase this problem in the framework of the MP. Consider 13 blank spaces, numbered 1 to 13: _ _ _ _ _ _ _ _ _ _ _ _ _. These 13 blank spaces correspond to 13 decisions to be made.

Decision #1: What letter to put in the first blank space?
Decision #2: What letter to put in the second blank space?
...
Decision #13: What letter to put in the 13th blank space?

Again we ask: How many choices have we for each decision? First an important note: In the word UNPREDICTABLY, no letter is repeated. (Indeed, UNPREDICTABLY is the longest “common” English word without any repeated letters.) For Decision #1, we can put U, N, P, R, E, D, I, C, T, A, B, L, or Y. So we have 13 choices for Decision #1. For Decision #2, having already used up a letter in Decision #1, we are left with 12 letters. So we have 12 choices for Decision #2. For Decision #3, having already used up a letter in Decision #1 and another letter in Decision #2, we are left with 11 letters. So we have 11 choices for Decision #3. ⋮

For Decision #13, having already used up a letter in Decision #1, another in Decision #2, another in Decision #3, ..., and another in Decision #12, we are left with one letter. So we have 1 choice for Decision #13. Altogether then, by the MP, there are 13 × 12 × ⋯ × 2 × 1 = 13! = 6,227,020,800 possible ways of making our decisions. This is also the number of ways there are to arrange the 13 letters in the word UNPREDICTABLY.

The next fact simply summarises what should already be obvious from the above examples: Fact 62. There are n! possible permutations of n distinct objects.

Here is an informal proof of the above fact.⁶³ Consider n empty spaces, numbered 1, 2, 3, ..., n: _ _ _ ... _. We are to fill them with the n distinct objects. For space #1, we have n possible choices. For space #2, we have n − 1 possible choices (because one object was already placed in space #1). ... And finally for space #n, we have only 1 object left and thus only 1 choice. By the MP then, there are n × (n − 1) × ⋯ × 1 = n! possible ways of filling in these n spaces with the n distinct objects. Example 491. The word COWDUNG has seven distinct letters. Hence, there are 7! = 5040 permutations of the letters in the word COWDUNG.
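For small words, Fact 62 can be checked by brute force. The sketch below is only an illustration (the helper name `count_permutations` is made up for this example): it lists every arrangement using Python’s `itertools.permutations` and compares the count against n!.

```python
from itertools import permutations
from math import factorial


def count_permutations(word):
    """Count the arrangements of a word with no repeated letters by listing them all."""
    if len(set(word)) != len(word):
        raise ValueError("this check assumes all letters are distinct")
    return sum(1 for _ in permutations(word))


# All arrangements of CAT, written out as strings.
cat_arrangements = ["".join(p) for p in permutations("CAT")]
print(cat_arrangements)

# Brute-force count versus the formula n! for COWDUNG (n = 7).
print(count_permutations("COWDUNG"), factorial(7))

# UNPREDICTABLY has too many arrangements to list, so we only use the formula.
print(factorial(13))
```

Listing works only for short words; for 13 letters there are already billions of arrangements, which is exactly why the formula is useful.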

63 This is informal because, amongst other omissions, we haven’t yet given a precise definition of the term permutation.

53.1 Permutations with Repeated Elements

In the previous section, we saw that there are 3! permutations of the three letters in the word CAT and 13! permutations of the 13 letters in the word UNPREDICTABLY. We made an important note: In each of these words, there was no repeated letter. We now consider permutations of a set where some elements are repeated. Example 492. How many permutations are there of the three letters in the word SEE? A naïve application of the MP would suggest that the answer is 3! = 6. This is wrong. Enumeration shows that there are only 3 possible permutations: EES, ESE, SEE.

To see why a naïve application of the MP fails, set up the problem in the framework of the MP. Consider 3 blank spaces, numbered 1, 2, and 3: _ _ _. These 3 blank spaces correspond to 3 decisions to be made. Decision #1: What letter to put in the first blank space? Decision #2: What letter to put in the second blank space? Decision #3: What letter to put in the third blank space? Again we ask: How many choices have we for each decision? For Decision #1, we can put E or S. So we have 2 choices for Decision #1. But now the number of choices available for Decision #2 depends on what we chose for Decision #1! (If we chose E in Decision #1, then we again have 2 choices for Decision #2. But if instead we chose S in Decision #1, then we now have only 1 choice for Decision #2.) This violates the implicit but important assumption in the MP that the number of choices available in one decision is independent of the choice made in the other decision. Hence, the MP does not directly apply.

The reason SEE has only 3 possible permutations (instead of 3! = 6) is that it contains a repeated element, namely E. But why would this make any difference? To understand why, let’s rename the second E as Ê, so that the word SEE is now transformed into a new word SEÊ. From the three letters of this new word, we’d again have 3! = 6 possible permutations: EÊS, ÊES, ESÊ, ÊSE, SEÊ, SÊE.

Restricting attention to the two letters EÊ, we see that there are 2! = 2 ways to permute these two letters. Hence, any single permutation (in the case where we do not distinguish between the two E’s) corresponds to 2 possible permutations (in the case where we do). The figure below illustrates how the 3 permutations of SEE correspond to the 6 permutations of SEÊ.

Hence, when we do not distinguish between the two E’s, there are only half as many possible permutations. We next consider permutations of SASS. Example 493. How many permutations are there of the four letters in the word SASS? The answer is 4!/3! = 4. Let’s see why.

If we distinguish between the three S’s, perhaps by calling them S, Ŝ, and S̄, then we’d have 4! = 24 possible permutations of the letters in the word SAŜS̄.

But amongst the three S’s themselves, we have 3! = 6 possible permutations: SŜS̄, SS̄Ŝ, ŜSS̄, ŜS̄S, S̄SŜ, and S̄ŜS. So distinguishing between the three S’s increases by 6-fold the number of possible permutations. Working backwards, the word SASS thus has one-sixth as many permutations as SAŜS̄. That is, SASS has 4!/3! = 4 possible permutations. The figure below illustrates how the 4 possible permutations of SASS correspond to the 24 possible permutations of SAŜS̄.

Example 494. How many permutations are there of the four letters in the word DEED? Answer:

$$\frac{4!}{2!\,2!}.$$

In the numerator, the 4! corresponds to the total of 4 letters. In the denominator, the 2! corresponds to the 2 D’s and the 2! corresponds to the 2 E’s. Where do these numbers come from? Let x be the number of permutations of DEED (i.e. x is our desired answer). If we distinguish between the two D’s, then we’d increase by 2!-fold the number of possible permutations, to x ⋅ 2!. If, in addition, we distinguish between the 2 E’s, then we’d increase again by 2!-fold the number of possible permutations, to x ⋅ 2! ⋅ 2!. But we know that if all 4 letters are distinct, then there are 4! possible permutations. Therefore,

$$x \cdot 2! \cdot 2! = 4!.$$

Rearrangement yields the answer:

$$x = \frac{4!}{2!\,2!} = 6.$$

You can go back and check that this answer is consistent with our answer for Exercise 202 (above). We next consider permutations of ASSESSES.

Example 495. Problem: How many permutations are there of the eight letters in the word ASSESSES? Answer:

$$\frac{8!}{2!\,5!}.$$

In the numerator, the 8! corresponds to the total of 8 letters. In the denominator, the 2! corresponds to the 2 E’s and the 5! corresponds to the 5 S’s. Where do these come from? Let y be the number of permutations of ASSESSES (i.e. y is our desired answer). If we distinguish between the two E’s, then we’d increase by 2!-fold the number of possible permutations, to y ⋅ 2!. If, in addition, we distinguish between the 5 S’s, then we’d increase again by 5!-fold the number of possible permutations, to y ⋅ 2! ⋅ 5!. But we know that if all 8 letters are distinct, then there are 8! possible permutations. Therefore,

$$y \cdot 2! \cdot 5! = 8!.$$

Rearrangement yields the answer:

$$y = \frac{8!}{2!\,5!}.$$

In general, Fact 63. Consider n objects, only k of which are distinct. Let $r_1, r_2, \dots, r_k$ be the numbers of times the 1st, 2nd, ..., and kth distinct objects appear. (So $r_1 + r_2 + \dots + r_k = n$.) Then the number of possible ways to permute these n objects is

$$\frac{n!}{r_1!\,r_2! \cdots r_k!}.$$

More examples: Example 496. How many permutations are there of the six letters in the word BANANA? We have three distinct letters: B, A, and N. The letter B appears 1 time. The letter A appears 3 times. The letter N appears 2 times. Hence, by the above Fact, the number of possible permutations of these 6 letters is

$$\frac{6!}{1!\,3!\,2!} = 60.$$

Of course, 1! is simply equal to 1. So for the denominator, we shall usually not bother to write out any 1!. So we will normally instead write that the number of permutations of BANANA is:

$$\frac{6!}{3!\,2!} = 60.$$

Example 497. How many permutations are there of the 11 letters in the word MISSISSIPPI? We have four distinct letters: M, I, S, and P. The letter M appears 1 time. The letter I appears 4 times. The letter S appears 4 times. The letter P appears 2 times. Hence, by the above Fact, the number of possible permutations of these 11 letters is

$$\frac{11!}{4!\,4!\,2!} = 34{,}650.$$

Exercise 208. There are 3 identical white tiles and 4 identical black tiles. How many ways are there of arranging these 7 tiles in a row? (Answer on p. 1081.)
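Fact 63 is easy to turn into a small program. The following is a sketch, not part of the textbook’s method: the function names `multiset_permutations` and `brute_force` are invented here. The first applies the formula n!/(r1! r2! ... rk!); the second lists all arrangements and discards duplicates, which is feasible only for short words.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def multiset_permutations(word):
    """Apply Fact 63: n! divided by r! for each distinct letter appearing r times."""
    result = factorial(len(word))
    for repeats in Counter(word).values():
        result //= factorial(repeats)  # each division is exact
    return result


def brute_force(word):
    """List every arrangement and keep only the distinct ones (short words only)."""
    return len(set(permutations(word)))


for word in ["SEE", "SASS", "DEED", "ASSESSES", "BANANA"]:
    print(word, multiset_permutations(word), brute_force(word))

# 11! arrangements are too many to list comfortably, so use the formula alone.
print("MISSISSIPPI", multiset_permutations("MISSISSIPPI"))
```

The two columns printed for each short word should agree, which is a useful sanity check on the “distinguish, then divide” argument used in Examples 492 to 495.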

53.2 Circular Permutations

Informal Definition. Two circular permutations are equivalent if one can be transformed into the other by means of a rotation. Example 498. There are 3! = 6 (linear) permutations of CAT. That is, there are 3! = 6 possible ways to fill the three letters into 3 linearly-arranged spaces, numbered 1, 2, and 3: _ _ _. In contrast, there are only 2! = 2 circular permutations of CAT. That is, there are only 2! = 2 possible ways to fill the three letters into 3 circularly-arranged spaces.

Let’s see why there are only 2 circular permutations of CAT.

The three seemingly-different arrangements above are considered to be the same circular permutation. This is because any arrangement is simply a rotation of another. Take the left red arrangement, rotate it clockwise by one-third of a circle to get the middle green arrangement. Repeat the rotation to get the right blue arrangement. The second and only other circular arrangement of CAT is shown below. Again, these three seemingly-different arrangements are considered to be the same circular permutation. This is because any arrangement is simply a rotation of another. Take the left black arrangement, rotate it clockwise by one-third of a circle to get the middle pink arrangement. Repeat the rotation to get the right orange arrangement. Note importantly, that the arrangement (or three arrangements) below cannot be rotated to get the arrangement (or three arrangements) above. Hence, the arrangement below is indeed distinct from the arrangement above.

It turns out that in general, if we have n distinct objects, there are (n − 1)! ways to arrange them in a circle. So here there are only (3 − 1)! = 2! = 2 ways to arrange CAT in a circle.

In general: Fact 64. n distinct objects have (n − 1)! circular permutations. Proof. Given n distinct objects, any 1 circular permutation can be rotated n times to obtain n distinct (linear) permutations. Hence, there are n times as many (linear) permutations as there are circular permutations. But we already know that there are n! (linear) permutations of n distinct objects. Hence, there are n!/n = (n − 1)! circular permutations of n distinct objects. Exercise 209. How many ways are there to seat 10 people in a circle? (Answer on p. 1081.)

Note that if there are repeated objects, then the problem is considerably more difficult. See Section 85.2 in the Appendices for a brief discussion.
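The rotation argument behind Fact 64 can also be checked by computer for small n. In this sketch (the function name `circular_permutations` and the “smallest rotation” trick are choices made for this illustration), every linear permutation is replaced by the alphabetically smallest of its n rotations, so that all rotations of one circle collapse to a single representative. It assumes the objects are distinct and can be sorted.

```python
from itertools import permutations
from math import factorial


def circular_permutations(objects):
    """Return one representative for each class of arrangements that differ only by rotation."""
    representatives = set()
    for p in permutations(objects):
        # All n rotations of p describe the same circle; keep the smallest as the label.
        rotations = [p[i:] + p[:i] for i in range(len(p))]
        representatives.add(min(rotations))
    return sorted(representatives)


print(circular_permutations("CAT"))  # the two circular arrangements of C, A, T
print(len(circular_permutations("ABCDE")), factorial(5 - 1))
```

As the proof of Fact 64 predicts, each representative stands for exactly n linear permutations, so the count is n!/n = (n − 1)!.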

53.3 Partial Permutations

Example 499. Using the 26-letter alphabet, how many 3-letter words can we form that have no repeated letters? This, of course, is simply the problem of filling in 3 empty spaces (numbered 1, 2, and 3) using 26 distinct elements: _ _ _. For space #1, we have 26 possible choices. For space #2, we have 25. And for space #3, we have 24. By the MP then, the number of ways to fill the three spaces is 26 × 25 × 24. This is also the number of three-letter words with no repeated letters.

Problems like the above example crop up often enough to motivate a new piece of notation:

Definition 106. Let n, k be positive integers with n ≥ k. Then P(n, k), read aloud as n permute k, is defined by

$$P(n, k) = \frac{n!}{(n - k)!}.$$

P (n, k) answers the following question: “Given n distinct objects and k spaces (where k ≤ n), how many ways are there to fill the k spaces?”

Just so you know, P(n, k) is also variously denoted nPk, $P^n_k$, $_nP_k$, etc., but we’ll stick solely with P(n, k) in this textbook.

Example 499 (continued from above). The number of 3-letter words without repeated letters is simply P(26, 3) = 26!/23! = 26 × 25 × 24. Example 500. Problem: Using the 22-letter Phoenician alphabet, how many 4-letter words can we form that have no repeated letters?

This, of course, is simply the problem of filling in 4 empty spaces using 22 distinct elements. So the answer is P(22, 4) = 22!/18! = 22 × 21 × 20 × 19 words. Exercise 210. Out of a committee of 11 members, how many ways are there to choose a president and a vice-president? (Answer on p. 1081.)
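Definition 106 translates directly into code. The sketch below defines a helper (the name `partial_permutations` is ours) from the formula n!/(n − k)! and compares it with `math.perm`, which is available in Python 3.8 and later, and with a brute-force listing for a small case.

```python
from itertools import permutations
from math import factorial, perm


def partial_permutations(n, k):
    """P(n, k) = n! / (n - k)!, the number of ways to fill k ordered spaces from n distinct objects."""
    return factorial(n) // factorial(n - k)


# Example 499: 3-letter words from 26 letters, no repeats.
print(partial_permutations(26, 3), 26 * 25 * 24)

# Example 500: 4-letter words from the 22-letter Phoenician alphabet, no repeats.
print(partial_permutations(22, 4), perm(22, 4))

# Brute-force check on a small case: 3 ordered spaces filled from 5 letters.
small_case = sum(1 for _ in permutations("ABCDE", 3))
print(small_case, partial_permutations(5, 3))
```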

53.4 Permutations with Restrictions

Example 501. At a dance party, there are 7 heterosexual married couples (and thus 14 people in total). Problem #1. How many ways are there of arranging them in a line, with the restriction that every person is next to his or her partner? Think of there as being 7 units (each unit being a couple). There are 7! ways to arrange these 7 units in a line. Within each unit, there are 2 possible arrangements. Hence, in total, there are $7! \times 2^7$ possible arrangements. Problem #2. Repeat the above problem, but now for a circle, rather than a line.

There are 6! ways to arrange the 7 units in a circle. Within each unit, there are 2 possible arrangements. Hence, in total, there are $6! \times 2^7$ possible arrangements.

Problem #3. How many ways are there of arranging them in a circle, with the restriction that every man is to the right of his wife? There are 6! ways to arrange the 7 units in a circle. Within each unit, there is only 1 possible arrangement. Hence, in total, there are 6! possible arrangements.

Example 502. (I assume you’re familiar with the standard 52-card deck.)

Problem #1. Using a standard 52-card deck, how many ways are there of arranging any 3 cards in a line, with the restriction that no two cards of the same suit are next to each other? This is the problem of filling in 3 spaces (numbered 1, 2, and 3) with 52 distinct objects: _ _ _. For space #1, we have 52 possible choices. For space #2, having picked a card of suit X for space #1, we must pick a card from some other suit Y. And so there are only 39 possible choices (we have three suits available: that’s 3 × 13 = 39).

For space #3, having picked a card of suit Y for space #2, we must pick a card from some other suit Z. Note that suit Z can be the same as suit X. And so there are 38 possible choices (we have three suits available, less the card used for space #1 — that’s 3 × 13 − 1 = 38). Altogether then, there are 52 × 39 × 38 possible arrangements.

Problem #2. Repeat the above problem, but now for a circle, rather than a line. One subtle thing is that, in addition to space #1 being of a different suit from space #2 and space #2 being of a different suit from space #3, we must also have that space #3 is of a different suit from space #1. Thus, there are 52 × 39 × 26 possible ways to fill in these three spaces, if they were in a line.

Since they are instead in a circle, there are 52 × 39 × 26 ÷ 3 possible ways to arrange three cards in a circle, with the condition that no two cards of the same suit are next to each other.
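Both answers in Example 502 are small enough to verify by listing every ordered choice of 3 cards (there are only 52 × 51 × 50 of them). The sketch below uses a made-up encoding of the deck as (rank, suit) pairs. For the circle, it labels each arrangement by the smallest of its 3 rotations, so that rotations of the same circle are counted once.

```python
from itertools import permutations

# Hypothetical encoding of the deck: 4 suits x 13 ranks = 52 (rank, suit) pairs.
DECK = [(rank, suit) for suit in "SHDC" for rank in range(1, 14)]

line_count = 0
circle_classes = set()
for hand in permutations(DECK, 3):
    suits = [suit for _, suit in hand]
    # Line: only the neighbours (1, 2) and (2, 3) must differ in suit.
    if suits[0] != suits[1] and suits[1] != suits[2]:
        line_count += 1
        # Circle: spaces 3 and 1 are neighbours too.
        if suits[2] != suits[0]:
            # The 3 rotations of one circle share a single label.
            circle_classes.add(min(hand[i:] + hand[:i] for i in range(3)))

print(line_count, 52 * 39 * 38)
print(len(circle_classes), 52 * 39 * 26 // 3)
```

Brute force like this does not replace the counting argument, but it is a handy way to catch a slip such as forgetting that space #3 may reuse the suit of space #1 in the line version.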

Exercise 211. (Answer on p. 1081.) There are 4 brothers and 3 sisters. In how many ways can they be arranged ... (a) in a line, without any 2 brothers being next to each other? (b) in a line, without any 2 sisters being next to each other? (c) in a circle, without any 2 brothers being next to each other? (d) in a circle, without any 2 sisters being next to each other?

54 How to Count: Combinations

P (n, k) is the number of ways we can fill k (ordered) spaces using n distinct objects.

In contrast, C(n, k) is the number of ways of choosing k out of n distinct objects. Equivalently, it is the same problem of filling k spaces using n distinct objects, except that now order does not matter.

Example 503. Suppose we have a committee of 13 members and wish to select a president and a vice-president. This is equivalent to the problem of filling in 2 spaces (numbered 1 and 2), given 13 distinct objects: _ _. The answer is thus simply P(13, 2) = 13 × 12.

Suppose instead that we want to choose two co-presidents. How many ways are there of doing so? This is simply the same problem as before — again we want to fill in 2 spaces, given 13 distinct objects. The only difference now is that the order of the 2 chosen objects does not matter. So the answer must be that there are P (13, 2)/2! ways of choosing the two co-presidents.

Example 504. How many ways are there of choosing 5 cards out of a standard 52-card deck? First, how many ways are there to fill 5 spaces (numbered 1 to 5) using 52 distinct objects (where order matters)? Answer: P(52, 5) = 52 × 51 × 50 × 49 × 48 = 311,875,200.

And so if we don’t care about order, we must adjust this number by dividing by 5! to get P(52, 5)/5! = 2,598,960. So the answer is that to choose 5 cards out of a 52-card deck, there are 2,598,960 ways. The above examples suggest that, in general, to choose k out of n given distinct objects, there are P(n, k)/k! possible ways. This motivates the following definition:

Definition 107. Let n, k be positive integers with n ≥ k. Then C(n, k), read aloud as n choose k, is defined by

$$C(n, k) = \frac{P(n, k)}{k!} = \frac{n!}{(n - k)!\,k!}.$$

It turns out that C(n, k) appears so often in maths that it has many alternative notations. One of the most common is $\binom{n}{k}$.

“n choose k” also has several names, such as the combination, the combinatorial number, and even the binomial coefficient. Shortly, we’ll see why the name binomial coefficient makes sense.
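The “count with order, then divide by k!” idea from Examples 503 and 504 can be written out as a short sketch. The helper name `choose` is ours; `math.comb` and `math.perm` are standard in Python 3.8 and later, and `itertools.combinations` lists the unordered selections directly.

```python
from itertools import combinations
from math import comb, factorial, perm


def choose(n, k):
    """C(n, k) = P(n, k) / k!: fill k ordered spaces, then forget the order."""
    return perm(n, k) // factorial(k)


# Example 503: two co-presidents from a committee of 13.
co_presidents = choose(13, 2)
listed_pairs = len(list(combinations(range(13), 2)))
print(co_presidents, listed_pairs)

# Example 504: 5-card hands from a 52-card deck.
five_card_hands = choose(52, 5)
print(perm(52, 5), five_card_hands, comb(52, 5))
```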

Exercise 212 gives an alternate expression for C(n, k) which you’ll often find very useful. Exercise 212. (Answer on p. 1083.) Show that:

$$C(n, k) = \frac{n \times (n - 1) \times (n - 2) \times \dots \times (n - k + 1)}{k!}.$$

Exercise 213. Compute C(4, 2), C(6, 4), and C(7, 3). (Answer on p. 1083.) Exercise 214. We wish to form a basketball team, consisting of 1 centre, 2 forwards, and 2 guards. We have available 3 centres, 7 forwards, and 5 guards. How many ways are there of forming a team? (Answer on p. 1083.)

Here’s a nice symmetry property: Fact 65. (Symmetry.) C(n, k) = C(n, n − k).

Proof. Choosing k out of n objects is the same as choosing which n − k out of n objects to ignore.


Example 505. We have a group of 100 men. 70 are needed for a task. The number of ways to choose these 70 men is:

$$C(100, 70) = \frac{100!}{30!\,70!}.$$

This is the same as the number of ways to choose the 30 men that will not be used for the task:

$$C(100, 30) = \frac{100!}{70!\,30!}.$$

54.1 Pascal’s Triangle

Pascal’s Triangle consists of a triangle of numbers. If we adopt the convention that the topmost row is row 0 and the leftmost term of each row is the 0th term, then the nth row, kth term is the number C(n, k):

                1
              1   1
            1   2   1
          1   3   3   1
        1   4   6   4   1
      1   5  10  10   5   1
    1   6  15  20  15   6   1
  1   7  21  35  35  21   7   1
                ⋮

It turns out that beautifully enough, each term is equal to the sum of the two terms above it. The next exercise asks you to verify several instances of this: Exercise 215. Verify the following: (a) C(1, 0)+C(1, 1) = C(2, 1); (b) C(4, 2)+C(4, 3) = C(5, 3); (c) C(17, 2) + C(17, 3) = C(18, 3). (Answer on p. 1083.) Fact 66. (Pascal’s Rule/Identity/Relation.) C(n + 1, k) = C(n, k) + C(n, k − 1). Proof. C(n + 1, k) is the number of ways of choosing k out of n + 1 distinct objects.

Suppose we do not choose the last object, i.e. the (n + 1)th object. Then we have to choose our k objects out of the first n objects. There are C(n, k) ways of doing so. Suppose we do choose the last object. Then we have to choose another k − 1 objects, out of the first n objects. There are C(n, k − 1) ways of doing so. Altogether then, by the Addition Principle, there are C(n, k) + C(n, k − 1) ways of choosing k out of n + 1 distinct objects.
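Pascal’s Rule gives a way to build the triangle using nothing but addition. The sketch below (the function name `pascal_rows` is ours) starts from row 0 and forms each new row by adding neighbouring terms of the previous row; it then compares every entry against `math.comb`.

```python
from math import comb


def pascal_rows(n_rows):
    """Build rows 0 to n_rows - 1 of Pascal's Triangle using C(n+1, k) = C(n, k) + C(n, k-1)."""
    rows = [[1]]
    for _ in range(n_rows - 1):
        prev = rows[-1]
        # Each interior term is the sum of the two terms above it; both ends are 1.
        interior = [prev[k - 1] + prev[k] for k in range(1, len(prev))]
        rows.append([1] + interior + [1])
    return rows


triangle = pascal_rows(8)
for row in triangle:
    print(row)

# Every entry built by addition alone agrees with the factorial formula for C(n, k).
matches = all(triangle[n][k] == comb(n, k) for n in range(8) for k in range(n + 1))
print(matches)
```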

54.2 The Combination as Binomial Coefficient

Mathematics is the art of giving the same name to different things. - Henri Poincaré, p. 34 in Science and Method. Poincaré’s quote is especially true in combinatorics. In this section, we’ll learn why C (n, k) can be called the combination and also the binomial coefficient. Verify for yourself that the following equations are true:

$$
\begin{aligned}
(1 + x)^0 &= 1, \\
(1 + x)^1 &= 1 + x, \\
(1 + x)^2 &= 1 + 2x + x^2, \\
(1 + x)^3 &= 1 + 3x + 3x^2 + x^3, \\
(1 + x)^4 &= 1 + 4x + 6x^2 + 4x^3 + x^4, \\
(1 + x)^5 &= 1 + 5x + 10x^2 + 10x^3 + 5x^4 + x^5, \\
(1 + x)^6 &= 1 + 6x + 15x^2 + 20x^3 + 15x^4 + 6x^5 + x^6, \\
(1 + x)^7 &= 1 + 7x + 21x^2 + 35x^3 + 35x^4 + 21x^5 + 7x^6 + x^7, \\
&\;\;\vdots
\end{aligned}
$$

Each of the expressions on the RHS is called a binomial series. Each can also be called the binomial expansion of $(1 + x)^n$. Notice anything interesting? No? Try this exercise:

Exercise 216. Compute $\binom{7}{0}, \binom{7}{1}, \binom{7}{2}, \binom{7}{3}, \binom{7}{4}, \binom{7}{5}, \binom{7}{6}, \binom{7}{7}$. Compare these to the coefficients of the binomial expansion of $(1 + x)^7$. What do you notice? (Answer on p. 1084.)

It turns out that somewhat surprisingly, the coefficients of the binomial expansions of $(1 + x)^n$ are simply $\binom{n}{0}, \binom{n}{1}, \dots, \binom{n}{n}$. As an additional exercise, you should verify for yourself that this is also true for n = 0 through n = 6.

There are several ways to explain why the combinatorial numbers also happen to be the binomial coefficients. Here we’ll give only the combinatorial explanation:

Consider $(1 + x)^2$. Expanding, we have

$$(1 + x)^2 = (1 + x)(1 + x) = 1 \cdot 1 + 1 \cdot x + x \cdot 1 + x \cdot x.$$

Consider the 4 terms on the right.

For 1 ⋅ 1, we “chose” 1 from the first (1 + x) and 1 from the second (1 + x). So from the two (1 + x)’s in the product, there is C(2, 0) = 1 way to choose 0 of the x’s.

For 1 ⋅ x, we “chose” 1 from the first (1 + x) and x from the second (1 + x). For x ⋅ 1, we “chose” x from the first (1 + x) and 1 from the second (1 + x). So from the two (1 + x)’s in the product, there are C(2, 1) = 2 ways to choose 1 of the x’s.

Finally, for x ⋅ x, we “chose” x from the first (1 + x) and x from the second (1 + x). So from the two (1 + x)’s in the product, there is C(2, 2) = 1 way to choose 2 of the x’s.

Altogether then, the coefficient on $x^0$ is C(2, 0) (“choose 0 of the x’s”), that on $x^1$ is C(2, 1) (“choose 1 of the x’s”), and that on $x^2$ is C(2, 2) (“choose 2 of the x’s”). That is:

$$(1 + x)^2 = C(2, 0)x^0 + C(2, 1)x^1 + C(2, 2)x^2 = 1 + 2x + x^2.$$

Exercise 217. (Answer on p. 1084.) Mimicking what was just done above, explain why

$$(1 + x)^3 = C(3, 0)x^0 + C(3, 1)x^1 + C(3, 2)x^2 + C(3, 3)x^3.$$

More generally, we have

Fact 67. Let n ∈ Z⁺. Then

$$(x + y)^n = \sum_{i=0}^{n} \binom{n}{i} x^{n-i} y^i = \binom{n}{0} x^n y^0 + \binom{n}{1} x^{n-1} y^1 + \binom{n}{2} x^{n-2} y^2 + \dots + \binom{n}{n} x^0 y^n.$$
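One way to convince yourself of Fact 67 is to let a computer do the expanding. The sketch below is only an illustration, and both function names are ours. `expand_one_plus_x` multiplies out (1 + x) repeatedly while storing just the list of coefficients, and `binomial_sum` evaluates the right-hand side of Fact 67 for specific numbers x and y.

```python
from math import comb


def expand_one_plus_x(n):
    """Coefficients of (1 + x)^n, lowest power first, found by repeated multiplication."""
    coeffs = [1]  # (1 + x)^0 = 1
    for _ in range(n):
        # Multiplying by (1 + x) adds the polynomial to a copy of itself shifted by one power.
        unshifted = coeffs + [0]
        shifted = [0] + coeffs
        coeffs = [a + b for a, b in zip(unshifted, shifted)]
    return coeffs


def binomial_sum(x, y, n):
    """Right-hand side of Fact 67 for given numbers x and y."""
    return sum(comb(n, i) * x ** (n - i) * y ** i for i in range(n + 1))


print(expand_one_plus_x(7))
print([comb(7, k) for k in range(8)])
print(binomial_sum(3, 2, 5), (3 + 2) ** 5)
```

The first two printed lists should match, and the last line compares the two sides of Fact 67 for x = 3, y = 2, n = 5.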

54.3 The Number of Subsets of a Set is $2^n$

By plugging x = 1, y = 1 into the last fact, we see that $(1 + 1)^n = 2^n$ is the sum of the terms in the nth row of Pascal’s triangle: Fact 68. Let n ∈ Z⁺. Then

$$2^n = \sum_{i=0}^{n} \binom{n}{i} = \binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \dots + \binom{n}{n}.$$

There’s a nice combinatorial interpretation of the above fact (Poincaré’s quote at work again). Consider the set S = {A, B}. S has $2^2 = 4$ subsets: ∅ = {}, {A}, {B}, and S = {A, B}.

Now consider the set T = {A, B, C}. T has $2^3 = 8$ subsets: ∅ = {}, {A}, {B}, {C}, {A, B}, {A, C}, {B, C}, and T = {A, B, C}.

In general, if a set has n elements, how many subsets does it have? We can couch this in the framework of the Multiplication Principle: this is really a sequence of n decisions of whether or not to include each element in the subset. There are 2 choices for each decision. Thus, there are $2^n$ choices altogether. In other words, using a set of n elements, we can form $2^n$ subsets. But of course, this must in turn be equal to the sum of the following:
• C(n, 0) ways to form subsets with 0 elements;
• C(n, 1) ways to form subsets with 1 element;
• C(n, 2) ways to form subsets with 2 elements;
...
• C(n, n) ways to form subsets with n elements.

Thus,

$$2^n = \binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \dots + \binom{n}{n}.$$
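The two ways of counting subsets can be compared directly for small sets. In this sketch (the function name `subsets` is ours), the subsets are listed size by size using `itertools.combinations`, so the number of subsets of each size is C(n, k) and the grand total should come to 2^n.

```python
from itertools import combinations
from math import comb


def subsets(elements):
    """List every subset of the given elements, grouped by size 0, 1, ..., n."""
    elements = list(elements)
    return [set(c) for k in range(len(elements) + 1) for c in combinations(elements, k)]


s_subsets = subsets("AB")
t_subsets = subsets("ABC")
print(s_subsets)
print(t_subsets)

# For a 5-element set: count subsets of each size and compare with C(5, k) and 2^5.
five = subsets("ABCDE")
by_size = [sum(1 for sub in five if len(sub) == k) for k in range(6)]
print(by_size, [comb(5, k) for k in range(6)], len(five), 2 ** 5)
```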

Exercise 218. Verify that $2^7 = \binom{7}{0} + \binom{7}{1} + \binom{7}{2} + \dots + \binom{7}{7}$. (Answer on p. 1084.)

Exercise 219. Using what you’ve learnt, write down $(3 + x)^4$. (Answer on p. 1085.)

Exercise 220. (Answer on p. 1085.) (a) The Tan family has 4 sons and the Wong family has 3 daughters. Using the sons and daughters from these two families, how many ways are there of forming 2 heterosexual couples? (b) The Lee family has 6 sons and the Ho family has 9 daughters. Using the sons and daughters from these two families, how many ways are there of forming 5 heterosexual couples?

55 Probability: Introduction

55.1 Mathematical Modelling

All models are wrong, but some are useful. - G.E.P. Box, p. 202 in Robustness in Statistics. Whenever we use maths in a real-world scenario, we have some mathematical model in mind. Here’s a very simple example just to illustrate: Example 506. We want to know how much material to purchase, in order to build a fence around a field. We might go through these steps: 1. Formulate a mathematical model: Our field is the shape of a rectangle, with length 100 m and breadth 50 m. 2. Analyse: The rectangle has perimeter 100 + 50 + 100 + 50 = 300 m. 3. Apply the results of our analysis: We need to buy enough material to build a 300-metre long fence. The figure below depicts how mathematical modelling works.

Starting with some real-world scenario, we go through these steps: 1. Formulate a mathematical model. That is, describe the real-world scenario in mathematical language and concepts. This first step is arguably the most important. It is often subjective: not everyone will agree that your mathematical model is the most appropriate for the scenario at hand. To use the above example, the field may not be a perfect rectangle, so some may object to your description of the field as a rectangle. Nonetheless, you may decide that all things considered, the rectangle is a good mathematical model.

2. Analyse the model. This involves using maths and the rules of logic. (A-level maths exams tend to be mostly concerned with this second step.) In the above example, this second step simply involved computing the perimeter of the rectangle — 100 + 50 + 100 + 50 = 300 m. Of course, for the A-levels, you can expect the analysis to be more challenging than this. Note that this second step, in contrast to the first, is supposed to be completely watertight, non-subjective, and with no room for disagreement. After all, hardly anyone reasonable could disagree that a perfect rectangle with length 100 m and breadth 50 m has perimeter 300 m. 3. Apply your results. Now apply the results of your analysis to the real-world scenario. In the above example, pretend you’re a mathematical consultant hired by the fence-builder. Then your final report might simply say, “We recommend the purchase of 300 m worth of fence material.” This third and last step is, like the first, subjective and open to debate. It involves your interpretation of what the results of your analysis mean (in the real world) and your recommendation of what actions to take. For example, you find that the fence will have perimeter 300 m and thus recommend that 300 m of fence material be purchased. However, someone else, looking at the same result, might point out that the corners of the fence require additional or special material; she might thus make a slightly different recommendation.

We’ve secretly always been using mathematical modelling; we just haven’t always been terribly explicit about it. The foregoing discussion was placed here because, with probability and statistical models, we want to be especially clear that we are doing mathematical modelling.

55.2 The Experiment as a Model of Scenarios Involving Chance

Real-world scenarios often involve chance. We can model such scenarios mathematically. For this purpose, we’ll use a mathematical object named the experiment, typically denoted E.⁶⁴ An experiment E = (S, Σ, P) is an ordered triple⁶⁵ composed of three objects, called the sample space S, the event space Σ (upper-case sigma), and the probability function P, where
• The sample space S is simply the set of possible outcomes.
• An event is simply any set of possible outcomes. In turn, the event space Σ is simply the set of all events.
• The probability function P simply assigns to each event some probability between 0 and 1. This probability is interpreted as the likelihood of that particular event occurring.
Examples:

64 An experiment is often instead called a probability triple or probability space or (probability) measure space.

65 Previously, in the only ordered triples we encountered, the three terms were always simply real numbers. Here however, the first two terms are sets and the third is a function. Nonetheless, this is all the same an ordered triple, albeit a more complicated one.

Example 507. We model a coin-flip with the experiment E = (S, Σ, P). What are the sample space S, the event space Σ, and the probability function P? 1. S = {H, T }.

The sample space is simply the set of possible outcomes. The choice of the sample space belongs to Step #1 (Formulate a mathematical model) in the process of mathematical modelling. It is subjective and open to disagreement. For example, John (another scientist) might argue that the coin sometimes lands exactly on its edge. This is exceedingly unlikely but nonetheless possible — one empirical estimate is that the US 5-cent coin has probability 1 in 6000 of landing on its edge when flipped (source). So John might denote this third possible outcome X and his sample space would instead be S = {H, T, X}. 2. Event space Σ = {∅, {H}, {T }, {H, T }}.

An event is simply any subset of S. In other words, an event is simply some set of possible outcomes. So here, {H} is an event. So too is {T }. But there are also two other events, namely ∅ = {} (this is the event that never occurs) and S = {H, T } (this is the event that always occurs).

The event space is simply the set of events. In other words, the event space is the set of all subsets of S.* As we saw in Section 54.3, given any finite set S, there are $2^{|S|}$ possible subsets of S. In general, given a finite sample space S, the corresponding event space Σ always simply contains $2^{|S|}$ events. And so here, since there are 2 possible outcomes, there are, altogether, $2^2 = 4$ possible events.

If the real-world outcome of the coin flip is Heads, then our interpretation (in terms of our model) is that “the events {H} and {H, T } occur”. If the real-world outcome of the coin flip is Tails, then our interpretation (in terms of our model) is that “the events {T } and {H, T } occur”.

The event ∅ never occurs, whatever the real-world outcome is. And the event S = {H, T} always occurs, whatever the real-world outcome is.
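The construction of the event space can be sketched in a few lines of Python. This is only an illustration: the helper names `event_space` and `events_that_occur` are made up here and are not part of the textbook.

```python
from itertools import chain, combinations

def event_space(sample_space):
    """Return the set of all subsets (events) of a finite sample space."""
    outcomes = sorted(sample_space)
    subsets = chain.from_iterable(
        combinations(outcomes, k) for k in range(len(outcomes) + 1)
    )
    return {frozenset(s) for s in subsets}

def events_that_occur(sigma, outcome):
    """Return the events that contain the realised outcome."""
    return {event for event in sigma if outcome in event}

sigma = event_space({"H", "T"})
print(len(sigma))                               # number of events: 2^2
print(events_that_occur(sigma, "H"))            # the events {H} and {H, T}
print(len(event_space({"H", "T", "X"})))        # John's event space: 2^3 events
```

Each event is stored as a `frozenset` so that events can themselves be collected into a set, mirroring the idea that Σ is a set of sets.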

*Provided S is finite. If S is infinite, then this sentence must be modified slightly, but this is well beyond the scope of the A-levels.

The mathematical modeller is free to select the sample space S she deems most appropriate. However, once she has selected the sample space S, the event space Σ is automatically determined by the rules of maths. There is no room for interpretation. Hence, the selection of the event space Σ belongs to Step #2 (Analysis) in the process of mathematical modelling. So likewise, John, who chooses S = {H, T, X} as his sample space, has no freedom to choose his event space Σ. It is automatically Σ = {∅, {H}, {T}, {X}, {H, T}, {H, X}, {T, X}, S} (consists of 8 elements).

3. Probability function P ∶ Σ → R.

The probability function simply assigns to each event a number (between 0 and 1) called a probability. So here, if heads and tails are “equally likely” (or the coin is “unbiased” or “fair”), then it makes sense to assign

P(∅) = 0,   P({H}) = P({T}) = 0.5,   P(S) = 1.

The mathematical modeller has no freedom over the domain Σ and codomain R of the probability function. However, she does have freedom to choose the mapping rule she deems most appropriate. Hence, the act of choosing the mapping rule belongs to Step #1 (Formulation) in the process of mathematical modelling. So here, if told that heads and tails are “equally likely” (or that the coin is “unbiased” or “fair”), the mathematical modeller would naturally choose to assign probability 0.5 to each of the events {H} and {T }.

John, who chooses S = {H, T, X} as his sample space, might instead assign probability 1/6000 to the event {X} and probability 5999/12000 to each of the events {H} and {T}.

Important Remark on Notation

It is correct and proper to write P ({H}) = P ({T }) = 0.5. It is incorrect and improper to write P (H) = P (T ) = 0.5. This is because the function P is of events (sets of outcomes) and NOT of outcomes themselves.

Nonetheless, we will often allow ourselves to be sloppy and write the “incorrect and improper” P(H) = P(T) = 0.5. This is because the notation P({H}) = P({T}) = 0.5 can get rather messy. But you should always remember, even as you write P(H) = P(T) = 0.5, that this is technically incorrect.

Example 508. A real-world die-roll can be modelled by an experiment E = (S, Σ, P), where

1. S = {1, 2, 3, 4, 5, 6}.

2. Event space:

Σ = {∅, {1}, {2}, . . . , {6}, {1, 2}, {1, 3}, . . . , {5, 6}, {1, 2, 3}, {1, 2, 4}, . . . , {4, 5, 6}, . . . , S}.

There are 6 possible outcomes and thus 2^6 = 64 possible events. The event space, given above, is simply the set of all possible events.

If the real-world outcome of the die roll is 3, then the interpretation (in terms of our model) is that the following 32 events occur: {3}, {1, 3}, {2, 3}, . . . , {1, 2, 3}, {1, 3, 4}, . . . , S = {1, 2, 3, 4, 5, 6}. (These are simply the events that contain the outcome 3.)

Similarly, if the real-world outcome of the die roll is 5, then the interpretation is that 32 events occur. You should be able to list all 32 of these events on your own.

3. Probability function P ∶ Σ → R.

If the die is “unbiased” or “fair”, then it makes sense to assign

P({1}) = P({2}) = P({3}) = P({4}) = P({5}) = P({6}) = 1/6.

What about the other 58 events? It makes sense to assign, for example, P({1, 3, 5, 6}) = 4/6. In general, the mapping rule of the probability function can be fully specified as: For any event A ∈ Σ,

P(A) = ∣A∣/∣S∣ = ∣A∣/6.

In words, given any event A, its probability P(A) is simply the number of elements it contains, divided by 6.
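As a quick check of the counts in the die-roll example, here is a minimal Python sketch. It assumes the fair-die mapping rule P(A) = ∣A∣/6; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import combinations

S = frozenset(range(1, 7))  # sample space of a die roll

# Event space: every subset of S, of every size from 0 to 6.
sigma = [frozenset(c) for k in range(len(S) + 1)
         for c in combinations(sorted(S), k)]

def P(event):
    """Fair-die mapping rule: number of outcomes in the event, divided by 6."""
    return Fraction(len(event), len(S))

events_containing_3 = [A for A in sigma if 3 in A]

print(len(sigma))                # total number of events
print(len(events_containing_3))  # events that occur if the roll is 3
print(P({1, 3, 5, 6}))           # 4/6, printed in lowest terms
```

Using `Fraction` keeps the probabilities exact, so 4/6 is reported as 2/3 rather than as a rounded decimal.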


Here’s the formal definition of an experiment:

Definition 108. An experiment is an ordered triple (S, Σ, P), where

• S, the sample space, is simply any set (interpreted as the set of possible outcomes in a real-world scenario involving chance).
• Σ, the event space, is the set of possible events.
• P, the probability function, has domain Σ, codomain R, and must satisfy the three Kolmogorov axioms (to be discussed below in Definition 109).

Given any event A ∈ Σ, the number P(A) is called the probability of A.

For the probability function P, the mathematical modeller is free to choose the mapping rule she deems most appropriate. The only restriction is that P satisfies three axioms, called the Kolmogorov Axioms, to be discussed in the next section.

Exercise 221. (Answers on pp. 1086, 1087, and 1088.) Consider each of the following real-world scenarios.

(a) You pick, at random, a card from a standard 52-card deck.
(b) You flip two fair coins.
(c) You roll two fair dice.

Model each of the above real-world scenarios as an experiment, by following steps (i) to (iii):

(i) Write down the appropriate sample space S.
(ii) How many possible events are there? Hence, how many elements does the event space Σ contain? If it is not too tedious, write out Σ in full.
(iii) What are the domain and codomain of the probability function P? Write down the probabilities of any three events. Given any event A ∈ Σ, what is P(A)?

(iv) In each scenario, explain briefly how John, another scientist, might justify choosing a different sample space, event space, and probability function.


55.3 The Kolmogorov Axioms

An axiom (or postulate) is a statement that is simply accepted as being true, without justification or proof.

Example 509. Euclid’s parallel axiom says that “Two non-parallel lines in the plane eventually intersect”. Historically, this axiom was accepted as a “self-evident truth”, without need for justification or proof. However, in the 19th century, mathematicians discovered “non-Euclidean geometries”, in which the parallel axiom did not hold. These turned out to have significant implications for maths, philosophy, and physics.

The above example illustrates that an axiom is not an eternal and immutable truth. Instead, it is merely a statement that some mathematicians tentatively accept as being true. Having listed a bunch of axioms, mathematicians then study their implications.

In probability theory, we impose three axioms on the probability function. These can be thought of as restrictions on what the probability function looks like. Informally:

1. Probabilities can’t be negative.
2. The probability that some outcome occurs is 1.
3. The probability that one of two disjoint events occurs is the sum of their individual probabilities.

Formally:

Definition 109. We say that a function P satisfies the three Kolmogorov axioms if:

1. Non-Negativity Axiom. For any event E ⊆ S, we have P(E) ≥ 0.
2. Normalisation Axiom. P(S) = 1.
3. Additivity Axiom. Given any two disjoint events E1, E2 ⊆ S, we have P(E1 ∪ E2) = P(E1) + P(E2).*

*This additivity axiom is actually not quite the correct third Kolmogorov axiom. Strictly speaking, we want instead a countable-additivity axiom: Given any disjoint events E1, E2, . . . , we have P(E1 ∪ E2 ∪ . . . ) = P(E1) + P(E2) + . . . (the union and the sum both running over i = 1, 2, 3, . . . ). But for the A-levels we’ll gloss over this.

In case you’ve forgotten, two sets are disjoint if they have no elements in common.
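For a finite sample space, the three axioms can be checked by brute force. The sketch below is an illustration only: the function `satisfies_kolmogorov` and the two candidate mapping rules are invented for this purpose, and the check covers only the finite (two-event) additivity axiom stated in Definition 109.

```python
from fractions import Fraction
from itertools import combinations

def power_set(S):
    """All subsets of a finite set S, as frozensets."""
    items = sorted(S)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def satisfies_kolmogorov(S, P):
    """Check the three axioms of Definition 109 on every event of a finite S."""
    sigma = power_set(S)
    non_negative = all(P(A) >= 0 for A in sigma)
    normalised = P(frozenset(S)) == 1
    additive = all(P(A | B) == P(A) + P(B)
                   for A in sigma for B in sigma if not (A & B))
    return non_negative and normalised and additive

S = {1, 2, 3, 4, 5, 6}

def fair(A):
    """Fair-die rule: |A| / 6."""
    return Fraction(len(A), 6)

def squared(A):
    """Hypothetical rule |A|^2 / 36: non-negative and normalised, but not additive."""
    return Fraction(len(A) ** 2, 36)

print(satisfies_kolmogorov(S, fair))
print(satisfies_kolmogorov(S, squared))
```

The second rule shows that non-negativity and normalisation alone do not make a function a probability function: it assigns 4/36 to {1, 2} but only 1/36 to each of {1} and {2}.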


55.4 Implications of the Kolmogorov Axioms

Obviously, P(∅) = 0 (the probability that the empty event occurs is 0). Previously, you’ve probably taken this and other “obvious” properties for granted. Now we’ll prove that they follow from the Kolmogorov axioms.

Recall that given any set A, its complement A^c (sometimes also denoted A′) is defined to be “everything else”. More precisely, A^c is the set of all elements that are not in A.

Proposition 12. Let P be a probability function and A, B be events. Then P satisfies the following properties:

1. Complements. P(A) = 1 − P(A^c).
2. Probability of Empty Event is Zero. P(∅) = 0.
3. Monotonicity. If B ⊆ A, then P(B) ≤ P(A).
4. Probabilities Are At Most One. P(A) ≤ 1.
5. Inclusion-Exclusion. P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

You may recognise that the Complements and the Inclusion-Exclusion properties are analogous to the CP and IEP from counting.

Proof. 1. Complements. By definition, A and A^c are disjoint. And so by the Additivity Axiom, P(A) + P(A^c) = P(A ∪ A^c).

Also by definition, A ∪ A^c = S. And so P(A ∪ A^c) = P(S).

By the Normalisation Axiom, P(S) = 1.

Altogether then, P(A) + P(A^c) = P(A ∪ A^c) = P(S) = 1. Rearranging, P(A) = 1 − P(A^c), as desired. The remainder of the proof is continued on p. 915 in the Appendices.

Venn diagrams are helpful for illustrating probabilities.

[Figure: Venn diagrams illustrating four of the above five properties.]


Exercise 222. (Answer on p. 1089.) Prove each of the following properties and illustrate with a Venn diagram:

(a) “If two events A and B are mutually exclusive, then P(A ∩ B) = 0.”
(b) “Let A, B, and C be events. Then P(A ∪ B ∪ C) = P(A) + P(A^c ∩ B) + P(A^c ∩ B^c ∩ C).”


56 Probability: Conditional Probability

Example 510. Flip three fair coins. Model this as an experiment E = (S, Σ, P), where

• The sample space is S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.
• The event space Σ has 2^8 = 256 elements.
• The probability function P ∶ Σ → R has mapping rule

P(HHH) = P(HHT) = ⋅ ⋅ ⋅ = P(TTT) = 1/8,

and more generally, for any event A ∈ Σ, P(A) = ∣A∣/8.

Problem: Suppose there is at least 1 tail. Find the probability that there are at least 2 heads.

There are 7 possible outcomes where there is at least 1 tail: HHT, HTH, HTT, THH, THT, TTH, and TTT. Each is equally likely to occur. Of these, 3 outcomes involve at least 2 heads (HHT, HTH, and THH). Thus, given there is at least 1 tail, the probability that there are at least 2 heads is simply 3/7.

The above analysis was somewhat informal. Here is a more formal analysis.

Let A be the event that there are at least 2 heads: A = {HHT, HTH, THH, HHH}.

Let B be the event that there is at least 1 tail: B = {HHT, HTH, HTT, THH, THT, TTH, TTT}.

A ∩ B is thus the event that there are at least 2 heads and at least 1 tail: A ∩ B = {HHT, HTH, THH}.

Our problem is equivalent to finding P(A∣B), the conditional probability of A given B, which is given by:

P(A∣B) = P(A ∩ B)/P(B) = (3/8)/(7/8) = 3/7.
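The three-coin calculation can be reproduced by enumerating the sample space. This is a minimal sketch assuming 8 equally likely outcomes; the helper names `P` and `conditional` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space for three coin flips: HHH, HHT, ..., TTT.
S = {"".join(flips) for flips in product("HT", repeat=3)}

def P(event):
    """Equally likely outcomes: |event| / |S|."""
    return Fraction(len(event), len(S))

def conditional(A, B):
    """P(A | B) = P(A and B) / P(B), assuming P(B) is not zero."""
    return P(A & B) / P(B)

A = {s for s in S if s.count("H") >= 2}  # at least 2 heads
B = {s for s in S if s.count("T") >= 1}  # at least 1 tail

print(P(A & B), P(B), conditional(A, B))
```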

Example 511. Let P be a probability function and A, B ∈ Σ be events, with:

• P(A) = 0.5 (the probability that A occurs is 0.5).
• P(B) = 0.6 (the probability that B occurs is 0.6).
• P(A ∩ B) = 0.2 (the probability that both A and B occur is 0.2).

Hence, given that B has occurred, the probability that A has also occurred is simply 0.2/0.6 = 1/3. (The information that P(A) = 0.5 is irrelevant.) Formally:

P(A∣B) = P(A ∩ B)/P(B) = 0.2/0.6 = 1/3.

The foregoing examples motivate the following definition:

Definition 110. Let P be a probability function and A, B ∈ Σ be events with P(B) ≠ 0. Then the conditional probability of A given B is denoted P(A∣B) and is defined by:

P(A∣B) = P(A ∩ B)/P(B).

Exercise 223. Roll two dice. Given that the sum of the two dice rolls is 8, what is the probability that we rolled at least one even number? (Answer on p. 1090.)


56.1 The Conditional Probability Fallacy (CPF)

Definition 111. The conditional probability fallacy (CPF) is the mistaken belief that P(A∣B) = P(B∣A) is always true.

Informally, the CPF is the fallacy of leaping from “If A, then probably B” to “Since B, then probably A.”

But in general, it is not true that P(A∣B) = P(B∣A). Instead:

Fact 69. Suppose P(A ∩ B) > 0. Then:

(a) If P(A) < P(B), then P(A∣B) < P(B∣A).
(b) If P(A) > P(B), then P(A∣B) > P(B∣A).
(c) If P(A) = P(B), then P(A∣B) = P(B∣A).

Proof. By definition, P(A∣B) = P(A ∩ B)/P(B) and P(B∣A) = P(B ∩ A)/P(A).

Thus, P(A∣B) = [P(A)/P(B)] ⋅ P(B∣A). And so, since P(B∣A) > 0,

P(A) < P(B) ⟹ P(A∣B) < P(B∣A),
P(A) > P(B) ⟹ P(A∣B) > P(B∣A),
P(A) = P(B) ⟹ P(A∣B) = P(B∣A).

The CPF is also known as the confusion of the inverse or the inverse fallacy. In different contexts, it is also known variously as the base-rate fallacy, false-positive fallacy, or prosecutor’s fallacy.


Example 512. Suppose the following statement is true: “If Mary has Ebola, then Mary will probably vomit today.” Formally, we might write P (Vomit∣Ebola) = 0.99.

Mary vomits today. One might then reason, “Since P (Vomit∣Ebola) = 0.99, by the CPF, we also have P (Ebola∣Vomit) = 0.99. Thus, Mary probably has Ebola.”

Formally, this reasoning is flawed because P(Vomit) is probably much larger than P(Ebola). Thus, P (Vomit∣Ebola) is probably much larger than P (Ebola∣Vomit).

Informally, the reasoning is flawed because:

• Ebola is extremely rare, so it is extremely unlikely that Mary has Ebola in the first place.
• Besides Ebola, there are many other alternative explanations for why Mary might have vomited. For example, she might have had motion sickness or food poisoning.

Example 513. Sally buys a 4D ticket every week. One day, she wins the first prize. To her astonishment, she wins the first prize again the following week. Her jealous cousin Ah Kow makes a police report, based on the following reasoning:

“Without cheating, the probability that Sally wins the first prize two weeks in a row is 1 in 100 million. Given that she did win first prize two weeks in a row, the probability that she didn’t cheat must likewise be 1 in 100 million. In other words, there is almost no chance that Sally didn’t cheat.”

Let’s rephrase Ah Kow’s reasoning more formally. Let A and B be the events “Sally wins the first prize two weeks in a row” and “Sally didn’t cheat”, respectively. We know that P(A∣B) = 0.00000001. By the CPF, we have P(A∣B) = P(B∣A). Hence, P(B∣A) = 0.00000001. Equivalently, there is probability 0.99999999 that Sally cheated.

Formally, this reasoning is flawed because P(B) is probably much larger than P(A). Thus, P(B∣A) is probably much larger than P(A∣B). Informally, the reasoning is flawed because:

• Cheating in 4D is extremely rare (and difficult), so it is extremely unlikely that Sally cheated in the first place.
• Besides cheating, there are many other alternative explanations for why there exists an individual who won first prize two weeks in a row. One important alternative explanation is that so many individuals buy 4D tickets regularly that there will invariably be someone as lucky as Sally. Suppose that only 100,000 Singaporeans (less than 2% of Singapore’s population) buy one 4D number every week. Then we’d expect that about once every 20 years, one of these 100,000 Singaporeans will have the fortune of winning the first prize on consecutive weeks. Rare, but hardly impossible.
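The “about once every 20 years” figure can be reproduced with a short calculation. The sketch below rests on assumptions that are not spelled out in the example: a first-prize probability of 1/10,000 per number per draw (implied by the 1-in-100-million figure), one draw per week, and independent draws.

```python
from fractions import Fraction

p_first_prize = Fraction(1, 10_000)   # assumed chance per number per draw
p_back_to_back = p_first_prize ** 2   # win this week and next: 1 in 100 million

players = 100_000                     # weekly players, from the example
expected_per_week = players * p_back_to_back  # expected back-to-back winners starting in a given week

weeks_between = 1 / expected_per_week          # average wait, in weeks
years_between = float(weeks_between) / 52      # assumed 52 draws a year

print(p_back_to_back, expected_per_week, weeks_between, round(years_between, 1))
```

The average wait works out to 1,000 weeks, which is a little over 19 years.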


The next example uses concrete numbers to illustrate how large the discrepancy between P (A∣B) and P (B∣A) can be.

Example 514. A randomly-chosen person is given a free smallpox screening. We know that 1 out of every 1,000,000 people has smallpox. The test is very accurate: If you have smallpox, it correctly tells you so 99% of the time. (Equivalently, it gives a false negative only 1% of the time.) And if you don’t have smallpox, it also correctly tells you so 99% of the time. (Equivalently, it gives a false positive only 1% of the time.)

Formally, let S, +, and − denote the events “the randomly-chosen person has smallpox”, “the test returns positive”, and “the test returns negative”. Then

P(S) = 1/1,000,000,   P(+∣S) = 0.99,   P(−∣S) = 0.01,
P(S^c) = 999,999/1,000,000,   P(−∣S^c) = 0.99,   P(+∣S^c) = 0.01.

The test result returns positive (i.e. it says that the randomly-chosen person has smallpox). What is the probability that this person actually has smallpox?

In words, it is easy to confuse “the probability of a positive test result conditional on having smallpox” with “the probability of having smallpox conditional on a positive test result”. Formally, this is the CPF. One starts with P(+∣S) = 0.99 and confusedly concludes that P(S∣+) = 0.99, so that this person almost certainly has smallpox.

In fact, as we now show, despite testing positive, the person is very unlikely to have smallpox. The correct answer is P(S∣+) ≈ 1/10,000! In the steps below, each =* simply uses the definition of conditional probability (Definition 110):

P(S∣+) =* P(S ∩ +)/P(+)
       =* P(S)P(+∣S)/P(+)
       = P(S)P(+∣S)/[P(+ ∩ S) + P(+ ∩ S^c)]
       =* P(S)P(+∣S)/[P(S)P(+∣S) + P(S^c)P(+∣S^c)]
       = (1/1,000,000 × 0.99)/(1/1,000,000 × 0.99 + 999,999/1,000,000 × 0.01)
       = 0.00009899029 ≈ 1/10,000.
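The smallpox calculation can be checked numerically. The function below is an illustrative sketch (its name and parameter names are not from the text); it simply evaluates the last line of the derivation with exact fractions.

```python
from fractions import Fraction

def posterior(prior, p_pos_given_disease, p_pos_given_healthy):
    """P(disease | positive) via the definition of conditional probability."""
    true_positive = prior * p_pos_given_disease           # P(S and +)
    false_positive = (1 - prior) * p_pos_given_healthy    # P(S^c and +)
    return true_positive / (true_positive + false_positive)

p = posterior(
    prior=Fraction(1, 1_000_000),           # P(S)
    p_pos_given_disease=Fraction(99, 100),  # P(+ | S)
    p_pos_given_healthy=Fraction(1, 100),   # P(+ | S^c)
)

print(p, float(p))
```

With these inputs the exact answer is 1/10,102, which is the “about 1 in 10,000” quoted in the example.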

This example illustrates how far astray the CPF can lead one. Now an actual, real-world example:


Example 515. The British mother convicted of murdering her two babies.

In 1996, Sally Clark’s first-born died suddenly within a few weeks of birth. In 1998, the same happened to her second child. Clark was then arrested on suspicion of murdering her babies.

At her trial, an “expert” witness claimed that in an affluent, non-smoking family such as Sally Clark’s, the probability of an infant suddenly dying with no explanation was 1/8543. Hence, he concluded, the probability of two sudden infant deaths in the same family was (1/8543)^2, or approximately 1 in 73 million.

The “expert” then committed the CPF. He argued that since

P(Two babies suddenly die∣Mother did not murder babies) = 1/73,000,000,

it therefore follows that

P(Mother did not murder babies∣Two babies suddenly die) = 1/73,000,000.

This erroneous reasoning led to Sally Clark being convicted for murdering her two babies. (Some of you may have noticed that the “expert” actually also made another mistake. But we’ll examine this only in the next chapter.)

It turns out that not only laypersons and court prosecutors commit the CPF. As we’ll see later, even academic researchers also often commit the CPF, when it comes to interpreting the results of a null hypothesis significance test (Chapter 72).

Exercise 224. (Answer on p. 1090.) At a murder scene, a sample of a blood stain is collected. Its DNA is analysed and compared to a database of DNA profiles. A match with one John Brown is found. Say there is only a 1 in 10 million chance that two random individuals have a DNA match. Does this mean that there is probability 1 in 10 million that the DNA match with John Brown is merely a coincidence, and thus a near-certainty that the blood stain is really his? Explain why or why not, with reference to the following conditional probabilities:

P(DNA match∣Blood stain is not John Brown’s),

P(Blood stain is not John Brown’s∣DNA match).


56.2 Two-Boys Problem (Fun, Optional)

This is a famous puzzle, first popularised by Martin Gardner in 1959.

Example 516. Consider all the families in the world that have two children, of whom at least one is a boy. Randomly pick one of these families. What is the probability that both children in this family are boys?

Think about it (set aside this book) before reading the answer below.

We already know that one child is a boy. So intuition might suggest that “obviously”, P(Both boys) = P(The other child is a boy) = 0.5.

Intuition would be wrong. Intuition goes astray by failing to recognise that there are three equally likely ways that a family with two children can have at least one boy: BB, BG, or GB. The answer is in fact 1/3:

P(BB∣At least one boy) = P(BB ∩ “At least one boy”)/P(At least one boy)
                       = P(BB)/P(At least one boy)
                       = P(BB)/[P(BB) + P(BG) + P(GB)]
                       = (1/4)/(1/4 + 1/4 + 1/4)
                       = 1/3.
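The answer of 1/3 can be confirmed by listing the four equally likely two-child families. This sketch assumes, as the example implicitly does, that each child is a boy or a girl with probability 1/2, independently of the other child.

```python
from fractions import Fraction
from itertools import product

# Each family is an ordered pair (first child, second child).
families = list(product("BG", repeat=2))  # BB, BG, GB, GG

at_least_one_boy = [f for f in families if "B" in f]
both_boys = [f for f in at_least_one_boy if f == ("B", "B")]

answer = Fraction(len(both_boys), len(at_least_one_boy))
print(at_least_one_boy, answer)
```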

In 2010, the following variant of the above Martin Gardner problem was presented.


Example 517. Consider all the families in the world that have two children, of whom at least one is a boy born on a Tuesday. Randomly pick one of these families. What is the probability that both children in this family are boys?

Those familiar with the previous problem might think, “Well, this is exactly the same as the two-boys problem, except with an obviously-irrelevant bit of information about the boy being born on a Tuesday. So the answer must be the same as before: 1/3.”

It turns out though that, surprisingly, the Tuesday bit of information makes a big difference. The answer is 13/27 ≈ 0.481. This is much closer to 0.5 than to 1/3!

Consider all the “two-child, at-least-one-boy-born-on-a-Tuesday” families in the world. The four mutually-exclusive possibilities are

Case | Child #1 | Child #2 | Probability
B_T B | Boy born on Tuesday | Boy (born on any day) | P(B_T B) = 1/14 ⋅ 1/2 = 7/196
B_T G | Boy born on Tuesday | Girl | P(B_T G) = 1/14 ⋅ 1/2 = 7/196
G B_T | Girl | Boy born on Tuesday | P(G B_T) = 1/2 ⋅ 1/14 = 7/196
B_N B_T | Boy not born on Tuesday | Boy born on Tuesday | P(B_N B_T) = 6/14 ⋅ 1/14 = 6/196

Altogether then, amongst two-child families with at least one boy born on a Tuesday, the proportion that have two boys is

P(BB∣At least one Tuesday boy)
= P(BB ∩ “At least one Tuesday boy”)/P(At least one Tuesday boy)
= P(Both boys, at least one of whom born on Tuesday)/P(At least one Tuesday boy)
= [P(B_T B) + P(B_N B_T)]/[P(B_T B) + P(B_T G) + P(B_N B_T) + P(G B_T)]
= (7/196 + 6/196)/(7/196 + 7/196 + 6/196 + 7/196)
= 13/27.
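The 13/27 answer can also be checked by enumerating all 196 equally likely (sex, weekday) combinations for two children. This is a sketch under the example's implicit assumptions: sexes and weekdays of birth are uniform and independent. The weekday index chosen for Tuesday is arbitrary.

```python
from fractions import Fraction
from itertools import product

TUESDAY = 1  # arbitrary label for one of the 7 weekdays

# A child is a (sex, weekday) pair: 2 x 7 = 14 equally likely types.
children = list(product("BG", range(7)))
families = list(product(children, repeat=2))  # 14 x 14 = 196 families

def is_tuesday_boy(child):
    return child == ("B", TUESDAY)

conditioned = [f for f in families if any(is_tuesday_boy(c) for c in f)]
both_boys = [f for f in conditioned if all(sex == "B" for sex, _ in f)]

answer = Fraction(len(both_boys), len(conditioned))
print(len(conditioned), len(both_boys), answer)
```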

57 Probability: Independence

Informally, two events A and B are independent if the probability that both occur is simply the product of the probabilities that each occurs. Independence is thus analogous to the MP from counting. Formally:

Definition 112. Two events A, B ∈ Σ are independent if

P(A ∩ B) = P(A)P(B).

There is a second, equivalent perspective of independence. Informally, two events A and B are independent if the probability that A occurs is independent of whether B has occurred. Formally:

Fact 70. Suppose P(B) ≠ 0. Then A, B are independent events ⇐⇒ P(A∣B) = P(A).

Proof. By definition of conditional probabilities, P(A∣B) = P(A ∩ B)/P(B); call this equation (1).

(⟹) If A and B are independent, then by definition of independence, P(A ∩ B) = P(A)P(B); call this equation (2). Plugging (2) into (1), we have P(A∣B) = P(A)P(B)/P(B) = P(A), as desired.

(⟸) Conversely, if P(A∣B) = P(A), then multiplying (1) by P(B) gives P(A ∩ B) = P(A∣B)P(B) = P(A)P(B), so that A and B are independent.


Example 518. Flip two fair coins. Model this with the usual experiment, where

• S = {HH, HT, TH, TT},
• Σ contains 2^4 = 16 elements, and
• P({HH}) = P({HT}) = P({TH}) = P({TT}) = 1/4.

Let H1 be the event that the first coin flip is Heads, that is, H1 = {HH, HT}. Analogously define T1, H2, and T2.

The intuitive idea of independence is easy to grasp. If we say that the two coin flips are independent, what we mean is that the following four conditions are true:

1. H1 and H2 are independent. (The probability that the second flip is heads is independent of whether the first flip is heads.)
2. H1 and T2 are independent. (The probability that the second flip is tails is independent of whether the first flip is heads.)
3. T1 and H2 are independent. (The probability that the second flip is heads is independent of whether the first flip is tails.)
4. T1 and T2 are independent. (The probability that the second flip is tails is independent of whether the first flip is tails.)

Formally:

1. P(H1 ∩ H2) = P({HH}) = P(H1)P(H2) = P({HH, HT})P({HH, TH}) = 0.5 ⋅ 0.5 = 0.25.
2. P(H1 ∩ T2) = P({HT}) = P(H1)P(T2) = P({HH, HT})P({HT, TT}) = 0.5 ⋅ 0.5 = 0.25.
3. P(T1 ∩ H2) = P({TH}) = P(T1)P(H2) = P({TH, TT})P({HH, TH}) = 0.5 ⋅ 0.5 = 0.25.
4. P(T1 ∩ T2) = P({TT}) = P(T1)P(T2) = P({TH, TT})P({HT, TT}) = 0.5 ⋅ 0.5 = 0.25.


Example 519. Flip a fair coin and roll a fair die. This can be modelled by an experiment, where

• S = {H1, H2, H3, H4, H5, H6, T1, T2, T3, T4, T5, T6}.
• Σ consists of 2^12 events.
• P(A) = ∣A∣/12, for any event A ∈ Σ.

Now consider the event “Heads” E1 = {H1, H2, H3, H4, H5, H6}, and the event “Roll an odd number” E2 = {H1, H3, H5, T1, T3, T5}. These two events E1 and E2 are independent, as we now verify:

P(E1∣E2) = P(E1 ∩ E2)/P(E2) = (3/12)/(6/12) = 1/2 = P(E1).
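Independence in the coin-and-die experiment can be checked directly from Definition 112. In the sketch below, the helper `independent` and the extra event E3 (“Heads and a six”) are illustrative additions, included only to show a pair of events that fails the test.

```python
from fractions import Fraction

# Sample space: a coin face followed by a die face, e.g. "H3".
S = {coin + str(die) for coin in "HT" for die in range(1, 7)}

def P(event):
    """Equally likely outcomes: |event| / 12."""
    return Fraction(len(event), len(S))

def independent(A, B):
    """Definition 112: P(A and B) equals P(A) times P(B)."""
    return P(A & B) == P(A) * P(B)

E1 = {s for s in S if s[0] == "H"}          # Heads
E2 = {s for s in S if int(s[1]) % 2 == 1}   # roll an odd number
E3 = {"H6"}                                 # hypothetical event: Heads and a six

print(independent(E1, E2))  # Heads vs odd roll
print(independent(E1, E3))  # Heads vs "Heads and a six"
```

E1 and E3 are not independent because knowing that E3 occurred guarantees that E1 occurred.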

More broadly, we can even say that the coin flip and die roll are independent. Informally, this means that the outcome of the coin flip has no influence on the outcome of the die roll, and vice versa. The idea of independence is a little tricky to illustrate on a Venn diagram. I’ll try anyway.


Example 520. The Venn diagram below illustrates a sample space with 100 equally likely outcomes (represented by 100 small squares). The event A is highlighted in red. The event B is highlighted in blue. P(A) = 0.2 (A is made of 20 small squares). P(B) = 0.1 (B is made of 10 small squares). The event A ∩ B, coloured in green, is made of 2 small squares, so P(A ∩ B) = 0.02. We compute

P(A∣B) = P(A ∩ B)/P(B) = 0.02/0.1 = 0.2.

We observe that P(A) = 0.2 = P(A∣B). And so by Fact 70, we conclude that the events A and B are independent.


Exercise 225. Symmetry of Independence. (Answer on p. 1091.) In Fact 70, we showed that “A, B independent ⇐⇒ P(A∣B) = P(A)”. Now prove that “A, B are independent events ⇐⇒ P(B∣A) = P(B).”

Exercise 226. (Answer on p. 1091.) An example of a transitive relation is equality: If A = B and B = C, then A = C. Another example is ≤: If A ≤ B and B ≤ C, then A ≤ C. In contrast, independence is not transitive, as this exercise will demonstrate. That is, even if A and B are independent, and B and C are independent, it may not be that A and C are also independent.

Flip two fair coins. Let H1 be the event that the first coin flip is heads, H2 be the event that the second is heads, and T1 be the event that the first flip is tails. Show that

(a) H1 and H2 are independent.
(b) H2 and T1 are independent.
(c) H1 and T1 are not independent.


57.1 Warning: Not Everything is Independent

The idea of independence is intuitively easy to grasp. Indeed, so much so that students often assume that “everything is independent”. This is a mistake. Unless you’re explicitly told, NEVER assume that two events are independent.

Here are two examples where the assumption of independence is plausible:

Example 521. The event “coin-flip #1 is heads” and the event “coin-flip #2 is heads” are probably independent.

Example 522. The event “die-roll #1 is 3” and the event “die-roll #2 is 6” are probably independent.

Here are two examples where the assumption of independence is not plausible:

Example 523. The event “Google’s share price rises today” is probably not independent of the event “Apple’s share price rises today”.

Example 524. The event “it rains in Singapore today” is probably not independent of the event “it rains in Kuala Lumpur today”.

Nonetheless, the assumption of independence is frequently (and incorrectly) made even when it is implausible. One reason is that the maths is easy if we assume independence: we can simply multiply probabilities together.

We now revisit the Sally Clark case. Previously, we saw that the court’s “expert” witness committed the CPF. Now, we’ll see that he also made a second mistake: that of assuming independence.


Example 525. The “expert” witness claimed that in an affluent, non-smoking family such as Sally Clark’s, the probability of an infant suddenly dying with no explanation was 1/8543. Hence, he concluded, the probability of two sudden infant deaths in the same family was (1/8543)^2, or approximately 1 in 73 million. Can you spot the error in the reasoning?

By simply multiplying together probabilities, the “expert” implicitly assumed that the two events, “sudden death of baby #1” and “sudden death of baby #2”, are independent. But as any doctor will tell you, if your family has a history of heart attack, diabetes, or pretty much any other ailment, then you may be at higher risk (than the average person) of suffering the same. And so, it may well be that in any given year, a random person has probability 0.001 of dying of a heart attack. It does not however follow that in any given year, a random family has probability 0.001^2 = 0.000001 of two deaths by heart attack.

Similarly, it may be that if one baby in a family has already suddenly died, a second baby is at higher risk (than the average baby) of suddenly dying.

Exercise 227. (Answer on p. 1091.) Say the probability that a randomly-chosen person is or was an NBA player is one in a million. (This is probably about right, since there’ve only ever been 4,000 or so NBA players, since the late 1940s.) The Barry family had four players in the NBA: the father Rick Barry and three of his four sons Jon, Brent, and Drew. (The oldest son Scooter didn’t make the NBA but was still good enough to play professionally in other basketball leagues around the world.) A journalist concludes that the probability of a Barry family ever occurring is

(1/1,000,000)^4 = 1/1,000,000,000,000,000,000,000,000.

This is equal to the probability of buying a 4D number on six consecutive weeks, and winning first prize every time. Is the journalist correct?


57.2 Probability: Independence of Multiple Events

Definition 113. Let P be a probability function and A, B, C ∈ Σ be events.

A, B, C are pairwise independent if all three of the following conditions are true: P(A ∩ B) = P(A)P(B), P(B ∩ C) = P(B)P(C), P(A ∩ C) = P(A)P(C).

A, B, C are independent if in addition to the above three conditions being true, it is also true that P(A ∩ B ∩ C) = P(A)P(B)P(C).

It is tempting to believe that pairwise independence implies independence. That is, if the first three conditions listed above are true, then so is the fourth. Alas, this is false, as the next exercise demonstrates:

Exercise 228. (Pairwise independence does not imply independence.) (Answer on p. 1091.) Flip two fair coins. Let H1 be the event that the first coin flip is heads, T2 be the event that the second is tails, and X be the event that the two coin flips are different. Show that (a) These three events are pairwise independent. (b) These three events are not independent.


58 Fun Probability Puzzles

58.1 The Monty Hall Problem

The Monty Hall Problem is probably the world’s most famous probability puzzle. It takes less than a minute to state. Yet its counter-intuitive answer confuses nearly everyone.

You’re at a gameshow. There are three boxes, labelled #1, #2, and #3. One box contains one year’s worth of a Singapore minister’s salary. The other two are empty. You are asked to pick one box (but you are not allowed to open it yet).

The host, who knows where the minister’s salary is, opens one of the other two boxes, to reveal that it is empty. Important: The host is not allowed to open the box that contains the minister’s salary; he must always open a box that is empty.

You’re now given a choice: Stay (with your original choice) or switch (to the other unopened box). What should you do? To illustrate:

Example 526. Say you pick Box #2. The host then opens an empty Box #1. You’re now given a choice: Stay (with Box #2) or switch (to Box #3). Which do you choose?

[Figure: three boxes. Box #1 is open and empty; Box #2 is your original choice; Box #3 is labelled “Should you switch?”]

Take as long as you need to think about this problem, before reading on for the answer.


A magazine columnist named Marilyn vos Savant⁶⁶ gave the correct answer: Yes; you should switch. The first door has a 1/3 chance of winning, but the second door has a 2/3 chance.

Here are two informal explanations:

1. The probability that the minister’s salary is in the box you picked is 1/3. The probability that the minister’s salary is in either of the other two boxes is 2/3. Of the other two boxes, the gameshow host (who knows where the salary is) helps you eliminate one of them. So the remaining unopened box still has probability 2/3 of containing the minister’s salary.

2. Imagine instead that there are 100 boxes, of which one contains the minister’s salary and the others are empty. You pick one. Of the remaining 99, the gameshow host opens 98. You are again given the choice: Should you stay or switch? In this more extreme version of the game, it is perhaps more obvious that your originally-picked box has only probability 1/100 of containing the minister’s salary, while the only other unopened box has probability 99/100 of the same. Therefore, you should switch.

Here’s a more formal explanation using the method of enumeration:

3. Say you originally pick Box #1. There are three possible cases, each occurring with probability 1/3:

Case   Box #1              Box #2              Box #3              Host opens
A      Minister’s salary   Empty               Empty               Box #2 or Box #3
B      Empty               Minister’s salary   Empty               Box #3
C      Empty               Empty               Minister’s salary   Box #2

Not switching wins you the minister’s salary only in Case A (1/3 probability). Switching wins you the minister’s salary in Cases B and C (2/3 probability).
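The enumeration above can be reproduced mechanically. The following is only a sketch: the function name monty_hall_exact is made up for illustration, and it assumes that in Case A the host picks between the two empty boxes with equal probability (the text does not specify this, and it does not affect the answer).

```python
from fractions import Fraction

def monty_hall_exact(pick=1, boxes=(1, 2, 3)):
    """Return (P(win if stay), P(win if switch)) by enumerating where the prize is."""
    stay = switch = Fraction(0)
    for prize in boxes:                          # each case has probability 1/3
        p_case = Fraction(1, len(boxes))
        # The host may open any box that is neither your pick nor the prize.
        openable = [b for b in boxes if b != pick and b != prize]
        for opened in openable:                  # assumed: host chooses uniformly
            p = p_case / len(openable)
            remaining = next(b for b in boxes if b not in (pick, opened))
            stay += p * (pick == prize)
            switch += p * (remaining == prize)
    return stay, switch

print(monty_hall_exact())   # (Fraction(1, 3), Fraction(2, 3))
```

Staying wins with probability 1/3 and switching with probability 2/3, whichever box you originally pick.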

66 Marilyn vos Savant was, briefly, in the Guinness Book of Records as the person with the world’s highest IQ, until Guinness retired this category because IQ tests were considered to be too unreliable.

Even with the above explanations, some of you may remain unconvinced. Don’t worry, you are not alone. After Marilyn’s initial response, 10,000 readers sent in letters telling her she was wrong. Some were from Professors of Mathematics and PhDs. A few examples:⁶⁷

“As a professional mathematician, I’m very concerned with the general public’s lack of mathematical skills. Please help by confessing your error and in the future being more careful.”

“There is enough mathematical illiteracy in this country, and we don’t need the world’s highest IQ propagating more. Shame!”

“Maybe women look at math problems differently than men.”

Unfortunately for the above letter writers, Marilyn was correct and they were wrong.

The best way to convince the sceptical is through simulations: try this Google spreadsheet. Or if you don’t trust computers, do an actual experiment:

Class Activity. Form pairs. One person is the gameshow host and the other is the contestant. The host decides where the prize is (Box #1, #2, or #3). The contestant then picks a box. The host then tells the contestant which one of the other two boxes is empty. The contestant then decides whether to stay or switch. Repeat as many times as you have time for. Record the proportion of times that the contestant should have switched. You should find that this proportion is about 2/3.
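For readers who would rather simulate on their own computer, here is one possible sketch of such a simulation. It is not the spreadsheet mentioned above; the function name, the number of trials, and the random seed are arbitrary choices made for illustration.

```python
import random

def simulate_monty_hall(trials=100_000, seed=0):
    """Estimate the proportion of games in which switching wins."""
    rng = random.Random(seed)
    boxes = [1, 2, 3]
    switch_wins = 0
    for _ in range(trials):
        prize = rng.choice(boxes)    # host hides the prize
        pick = rng.choice(boxes)     # contestant picks a box
        # Host opens an empty box that the contestant did not pick.
        opened = rng.choice([b for b in boxes if b != pick and b != prize])
        # The box the contestant would end up with by switching.
        other = next(b for b in boxes if b not in (pick, opened))
        switch_wins += (other == prize)
    return switch_wins / trials

print(simulate_monty_hall())
```

The printed proportion should be close to 2/3, in line with the class activity.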

67 You can read more of these letters at her website.

58.2 The Birthday Problem

Example 527. (The birthday problem.) What is the smallest number n of people in a room, such that it is more likely than not, that at least 2 people in the room share the same birthday?*

Fix person #1’s birthday. Then
• The probability that person #2’s birthday is different (from person #1) is 364/365.
• The probability that person #3’s birthday is different (from persons #1 and #2) is 363/365.
• The probability that person #4’s birthday is different (from persons #1, #2, and #3) is 362/365.
• ...
• The probability that person #n’s birthday is different (from persons #1 through #n−1) is (366 − n)/365.

Altogether, the probability that no 2 persons share the same birthday is

\[ \frac{364}{365} \times \frac{363}{365} \times \frac{362}{365} \times \dots \times \frac{366 - n}{365}. \]

Hence, the probability that at least 2 persons share the same birthday is

\[ 1 - \frac{364}{365} \times \frac{363}{365} \times \frac{362}{365} \times \dots \times \frac{366 - n}{365}. \]

The smallest integer n for which the above probability is at least 0.5 is 23. (Wolfram Alpha.) That is, perhaps surprisingly, with just 23 people, it is more likely than not that at least 2 persons share a birthday.

*Assume there are no leap years (every year has 365 days). Also, assume each person’s birthday is equally likely to be on any day of the year and does not depend on the birthday of anyone else.
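The claim that n = 23 is the smallest such number can also be checked with a few lines of code instead of Wolfram Alpha. This is a sketch under the same assumptions as the example (365 equally likely, independent birthdays); the function name prob_shared_birthday is made up for illustration.

```python
def prob_shared_birthday(n, days=365):
    """P(at least 2 of n people share a birthday)."""
    p_all_different = 1.0
    for i in range(n):
        # Person #(i+1) must avoid the i birthdays already taken.
        p_all_different *= (days - i) / days
    return 1 - p_all_different

n = 1
while prob_shared_birthday(n) <= 0.5:
    n += 1
print(n, round(prob_shared_birthday(n), 4))   # 23 0.5073
```

With 22 people the probability is still below 0.5 (about 0.476); with 23 it first exceeds 0.5.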

59 Random Variables: Introduction

Informally, a random variable X is a function that assigns a real number (you can think of this as a “numerical code”) to each possible outcome s. We call any such real number an observed value of X.

Example 528. Model a fair coin-flip with the usual experiment E = (S, Σ, P), where

• S = {H, T }. • Σ = {∅, {H} , {T } , S}.

• P ∶ Σ → R is defined by P(∅) = 0, P({H}) = P({T}) = 0.5, and P(S) = 1.

Let X ∶ S → R be the random variable that indicates whether the coin-flip is heads. That is, the observed value of X is X(H) = 1 if the outcome is heads and X(T) = 0 if the outcome is tails.

Formally:

Definition 114. Let E = (S, Σ, P) be an experiment. A random variable X (on the experiment E) is any function with domain S and codomain R. Given any random variable X and any outcome s ∈ S, we call X(s) the observed (or realised) value of the random variable X. We often denote a generic observed value X(s) by the lower-case letter x.

59.1 A Random Variable vs. Its Observed Values

Students often confuse a random variable with an observed value of the random variable. This confusion is, of course, simply the confusion between a function and the value taken by the function. Example 528 (continued from above). X is a function with domain S and codomain R. X is therefore a random variable. If the outcome of the coin-flip is heads, we do not say that X is 1. Instead, we say that the observed value of X is 1. If the outcome of the coin-flip is tails, we do not say that X is 0. Instead, we say that the observed value of X is 0. Remember: A random variable X is a function that can take on many possible real number values. Each such value x = X(s) is called an observed value of X.

59.2 X = k Denotes the Event {s ∈ S ∶ X(s) = k}

Definition 115. Given a random variable X ∶ S → R, the notation “X = k” denotes the event {s ∈ S ∶ X(s) = k}. The notation “X ≥ k”, “X > k”, “X ≤ k”, “X < k”, “a ≤ X ≤ b”, etc. are similarly defined.

Example 528 (continued from above). X(H) = 1 and X(T ) = 0. So we can write: X = 1 denotes the event {s ∈ S ∶ X(s) = 1} = {H} , X = 0 denotes the event {s ∈ S ∶ X(s) = 0} = {T } .

Moreover, P ({H}) = 0.5 and P ({T }) = 0.5. So we can also write: P(X = 1) = 0.5 and P(X = 0) = 0.5.

Now let’s try some other arbitrary number like 13.71. Notice there is no outcome s such that X(s) = 13.71. Thus: X = 13.71 denotes the event {s ∈ S ∶ X(s) = 13.71} = ∅,

and P(X = 13.71) = 0.

Indeed, for any k ≠ 0, 1, there is no outcome s such that X(s) = k. Thus: X = k denotes the event {s ∈ S ∶ X(s) = k} = ∅,

and P(X = k) = 0.

Since P (∅) = 0, we also have P(X = k) = P (∅) = 0, for any k ≠ 0, 1.

Define Y ∶ S → R by Y(H) = 15.5, Y(T) = 15.5. Y is an example of a constant random variable. We may write:

Y = 15.5 denotes the event {s ∈ S ∶ Y(s) = 15.5} = {H, T}, and P(Y = 15.5) = 1.

Moreover, for any k ≠ 15.5,

Y = k denotes the event {s ∈ S ∶ Y(s) = k} = ∅, and P(Y = k) = 0.
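Since a random variable is just a function on S, the notation “X = k” can be mimicked directly in code. The sketch below represents the random variables of the coin-flip example as dictionaries; the helper names event and prob are made up for illustration, and the approach assumes a finite sample space with a known probability for each outcome.

```python
from fractions import Fraction

S = ["H", "T"]                                      # sample space of one fair coin-flip
P_outcome = {"H": Fraction(1, 2), "T": Fraction(1, 2)}

X = {"H": 1, "T": 0}           # indicator of heads
Y = {"H": 15.5, "T": 15.5}     # constant random variable

def event(rv, k):
    """The event 'rv = k', i.e. the set {s in S : rv(s) = k}."""
    return {s for s in S if rv[s] == k}

def prob(rv, k):
    """P(rv = k): add up the probabilities of the outcomes in the event."""
    return sum((P_outcome[s] for s in event(rv, k)), Fraction(0))

print(event(X, 1), prob(X, 1))
print(event(X, 13.71), prob(X, 13.71))
print(event(Y, 15.5), prob(Y, 15.5))
```

The event X = 13.71 comes out as the empty set with probability 0, while the event Y = 15.5 is all of S with probability 1.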

59.3 The Probability Distribution of a Random Variable

We call a complete specification of P(X = k) for all values of k the probability distribution (or probability law or probability mass function) of X. In the above example, we gave the probability distributions of both X and Y.

More examples of random variables and their probability distributions:

Example 529. Flip two fair coins. Model this with the usual experiment, where S = {HH, HT, TH, TT}. Let X ∶ S → R indicate whether the two coin flips are the same and Y ∶ S → R count the number of heads. That is,

X(HH) = 1, X(HT) = 0, X(TH) = 0, X(TT) = 1,
Y(HH) = 2, Y(HT) = 1, Y(TH) = 1, Y(TT) = 0.

And

P(X = 0) = 0.5, P(X = 1) = 0.5, and P(X = k) = 0, for any k ≠ 0, 1.

P(Y = 0) = 0.25, P(Y = 1) = 0.5, P(Y = 2) = 0.25, and P(Y = k) = 0, for any k ≠ 0, 1, 2.

Another example:


Example 530. Pick a random card from the standard 52-card deck. Model this with the usual experiment, where

S = {A♠, K♠, . . . , 2♠, A♥, K♥, . . . , 2♥, A♦, K♦, . . . , 2♦, A♣, K♣, . . . , 2♣}.

X ∶ S → R is the High Card Point count (used in the game of bridge). I.e.,

X(A of any suit) = 4, X(K of any suit) = 3, X(Q of any suit) = 2, X(J of any suit) = 1, X(Any other card) = 0.

Thus,

P(X = 0) = 36/52, P(X = 1) = 4/52, P(X = 2) = 4/52, P(X = 3) = 4/52, P(X = 4) = 4/52, and P(X = k) = 0, for any k ≠ 0, 1, 2, 3, 4.

Y ∶ S → R indicates whether the picked card is a spade (♠). I.e.,

Y(Any ♠) = 1, Y(Any other card) = 0.

Thus,

P(Y = 0) = 39/52, P(Y = 1) = 13/52, and P(Y = k) = 0, for any k ≠ 0, 1.
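The two distributions in the card example can be obtained by brute-force counting over the 52 cards. This is a rough sketch: the card encoding (rank, suit) and the helper name pmf are assumptions made for illustration, and each card is taken to be equally likely.

```python
from collections import Counter
from fractions import Fraction

ranks = ["A", "K", "Q", "J", "10", "9", "8", "7", "6", "5", "4", "3", "2"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = [(r, s) for s in suits for r in ranks]       # the 52 equally likely outcomes

HCP = {"A": 4, "K": 3, "Q": 2, "J": 1}

def X(card):
    """High Card Point count of a single card."""
    return HCP.get(card[0], 0)

def Y(card):
    """Indicator of whether the card is a spade."""
    return 1 if card[1] == "spades" else 0

def pmf(rv):
    """Map each value k in the range of rv to P(rv = k)."""
    counts = Counter(rv(card) for card in deck)
    return {k: Fraction(c, len(deck)) for k, c in sorted(counts.items())}

print(pmf(X))
print(pmf(Y))
```

Note that Fraction reduces automatically, so 36/52 is displayed as 9/13 and 4/52 as 1/13.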

Example 531. Roll two fair dice. Model this with the usual experiment, where, writing (a, b) for the roll in which the two dice show a and b,

S = {(1, 1), . . . , (1, 6), (2, 1), . . . , (2, 6), . . . , (6, 1), . . . , (6, 6)}.

X ∶ S → R is the sum of the two dice. And so for example,

X((2, 5)) = 7 and X((1, 4)) = 5.

The table below says that P(X = 2) = 1/36, because there is only one way the event X = 2 can occur. And P(X = 3) = 2/36, because there are two ways the event X = 3 can occur. You are asked to complete the table in the next exercise.

k     s such that X(s) = k     P(X = k)
2     (1, 1)                   1/36
3     (1, 2), (2, 1)           2/36
4
5
6
7
8
9
10
11
12

Exercise 229. (Continuation of the above example.) (Answer on p. 1092.)
(a) Complete the above table.
Consider the event E, described in words as “the sum of the two dice is at least 10”.
(b) Write down the event E in terms of X.
(c) Calculate P(E).

59.4 Random Variables Are Simply Functions

Example 531 (continued from above). Continue with the same roll-two-fair-dice example, with X again being the random variable that is the sum of the two dice. Writing (a, b) for the roll in which the two dice show a and b, we had

X((2, 5)) = 7 and X((1, 4)) = 5.

Let Y ∶ S → R be the product of the two dice. And so for example,

Y((2, 5)) = 10 and Y((1, 4)) = 4.

Remember: random variables are simply functions. And thus, we can manipulate random variables just like we manipulate any functions. So for example, consider the function X + Y ∶ S → R. It is also a random variable. We have

(X + Y)((2, 5)) = 17 and (X + Y)((1, 4)) = 9.

Similarly, consider the function XY ∶ S → R. It is also a random variable. We have

(XY)((2, 5)) = 70 and (XY)((1, 4)) = 20.

Finally, consider the function 4X − 5Y ∶ S → R. It is also a random variable. We have

(4X − 5Y)((2, 5)) = −22 and (4X − 5Y)((1, 4)) = 0.
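The point that random variables are simply functions, and can be combined like functions, translates almost word for word into code. In this sketch each outcome is a pair (first die, second die); the helper names add, mul, and lin are made up for illustration, and the two rolls evaluated are the ones consistent with the values in the example (a 2 and a 5; a 1 and a 4).

```python
from itertools import product

S = list(product(range(1, 7), repeat=2))    # the 36 outcomes (first die, second die)

def X(s):
    """Sum of the two dice."""
    return s[0] + s[1]

def Y(s):
    """Product of the two dice."""
    return s[0] * s[1]

def add(f, g):
    """The random variable f + g."""
    return lambda s: f(s) + g(s)

def mul(f, g):
    """The random variable fg."""
    return lambda s: f(s) * g(s)

def lin(a, f, b, g):
    """The random variable a*f + b*g."""
    return lambda s: a * f(s) + b * g(s)

for s in [(2, 5), (1, 4)]:
    print(s, X(s), Y(s), add(X, Y)(s), mul(X, Y)(s), lin(4, X, -5, Y)(s))
# (2, 5) 7 10 17 70 -22
# (1, 4) 5 4 9 20 0
```

Each of add(X, Y), mul(X, Y), and lin(4, X, -5, Y) is again a function from S to the real numbers, that is, again a random variable.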

Exercise 230. Continue with the above roll-two-fair-dice example. Let P ∶ S → R be the greater of the two dice. Let Q ∶ S → R be the difference of the two dice. Evaluate the functions P, Q, and PQ at [image of a dice roll] and [image of a dice roll]. (Answer on p. 1093.)

Exercise 231. (Answer on p. 1093.) Model a fair die-roll with the usual experiment E = (S, Σ, P). Define the function X ∶ S → R by the mapping rule X(1) = 1, X(2) = 2, X(3) = 3, X(4) = 4, X(5) = 5, and X(6) = 6.

Is X a random variable on E? Why or why not? If X is indeed a random variable on E, then write down also P(X = k), for all possible k.

Exercise 232. For each of the following real-world scenarios, write down, in precise mathematical notation (i) the experiment E = (S, Σ, P); (ii) what the random variable X is; and (iii) P(X = k), for all possible k. (Answers on pp. 1093 and 1094.)
(a) Flip 4 (fair) coins. Let the random variable X be a count of the number of heads.
(b) Roll 3 (fair) dice. Let the random variable X be the sum of the three dice. (Tedious.)

60 Random Variables: Independence

Definition 116. Given random variables X ∶ S → R and Y ∶ S → R, the notation “X = x, Y = y” denotes the event {s ∈ S ∶ X(s) = x, Y(s) = y}.

Example 532. Flip two fair coins. Model this with the usual experiment where S = {HH, HT, TH, TT}. Let X ∶ S → R indicate whether the two coin flips were the same and Y ∶ S → R count the number of heads. That is,

X(HH) = 1, X(HT) = 0, X(TH) = 0, X(TT) = 1, and
Y(HH) = 2, Y(HT) = 1, Y(TH) = 1, Y(TT) = 0.

Then X = 0, Y = 0 is the event that the two coin flips were not the same AND the number of heads was 0. By observation, this event is the empty set. Thus, P(X = 0, Y = 0) = P(∅) = 0.

X = 1, Y = 0 is the event that the two coin flips were the same AND the number of heads was 0. By observation, this event is {TT}. Thus, P(X = 1, Y = 0) = P({TT}) = 0.25.

Exercise: Verify for yourself that

P(X = 0, Y = 1) = 0.5, P(X = 0, Y = 2) = 0,
P(X = 1, Y = 1) = 0, P(X = 1, Y = 2) = 0.25.

Informally, two random variables are independent if knowing the value of one does not tell us anything about the value of the other.

Example 532 (continued from above). Flip two fair coins. We say the two coin-flips are independent. Informally, the outcome of one doesn’t affect the other. Knowing that the first coin-flip is heads tells us nothing about the second coin-flip.

A little more formally, let A and B be the random variables indicating whether the first and second coin-flip are heads (respectively). That is, A = 1 if the first coin-flip is heads and A = 0 otherwise; and B = 1 if the second coin-flip is heads and B = 0 otherwise. Then the informal statement “the two coin-flips are independent” may be translated into the formal statement “the random variables A and B are independent”. Informally, knowing the observed value of A tells us nothing about whether B = 0 or B = 1. (And vice versa.)

Formally:

Definition 117. Given random variables X ∶ S → R and Y ∶ S → R, we say that X and Y are independent if for all x, y,

P(X = x, Y = y) = P(X = x)P(Y = y).

Let’s restate the above definition more explicitly. Suppose X can take on values x1, x2, . . . , xn and Y can take on values y1, y2, . . . , ym. Then to say that X and Y are independent is to say that all of the following n × m pairs of events are independent:

X = x1, Y = y1;   X = x1, Y = y2;   . . . ;   X = x1, Y = ym;
X = x2, Y = y1;   X = x2, Y = y2;   . . . ;   X = x2, Y = ym;
⋮
X = xn, Y = y1;   X = xn, Y = y2;   . . . ;   X = xn, Y = ym.

Independence between two random variables is thus equivalent to independence between many pairs of events.


Example 532 (continued from above). We now verify, in more formal and precise language, that “the two coin-flips are indeed independent”. Again, A and B are the random variables indicating whether the first and second coin-flips are heads (respectively). We now verify that indeed, P(A = a, B = b) = P(A = a)P(B = b) for all possible values of a and b:

a, b            P(A = a, B = b)      P(A = a)P(B = b)
a = 0, b = 0    P({TT}) = 0.25       P({TH, TT}) P({HT, TT}) = 0.5 × 0.5   ✓
a = 1, b = 0    P({HT}) = 0.25       P({HH, HT}) P({HT, TT}) = 0.5 × 0.5   ✓
a = 0, b = 1    P({TH}) = 0.25       P({TH, TT}) P({HH, TH}) = 0.5 × 0.5   ✓
a = 1, b = 1    P({HH}) = 0.25       P({HH, HT}) P({HH, TH}) = 0.5 × 0.5   ✓
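The four checks in the table above are an instance of a general recipe: loop over every pair of values (a, b) and compare P(A = a, B = b) with P(A = a)P(B = b). Here is one way this might be sketched, assuming a finite sample space with equally likely outcomes; the helper names prob and independent are made up for illustration.

```python
from fractions import Fraction
from itertools import product

S = ["HH", "HT", "TH", "TT"]              # two fair coin-flips, equally likely outcomes

def A(s):
    """1 if the first flip is heads, 0 otherwise."""
    return 1 if s[0] == "H" else 0

def B(s):
    """1 if the second flip is heads, 0 otherwise."""
    return 1 if s[1] == "H" else 0

def prob(event):
    """Probability of an event, given as a true/false test on outcomes."""
    return Fraction(sum(1 for s in S if event(s)), len(S))

def independent(X, Y):
    """Check P(X = x, Y = y) = P(X = x) P(Y = y) for every pair (x, y)."""
    xs = {X(s) for s in S}
    ys = {Y(s) for s in S}
    return all(
        prob(lambda s: X(s) == x and Y(s) == y)
        == prob(lambda s: X(s) == x) * prob(lambda s: Y(s) == y)
        for x, y in product(xs, ys)
    )

print(independent(A, B))   # True
```

By contrast, independent(A, A) returns False: a random variable that is not constant is never independent of itself.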

Exercise 233. Flip two fair coins. Let X ∶ S → R indicate whether the two coin flips were the same and Y ∶ S → R count the number of heads. Are X and Y independent random variables? (Answer on p. 1096.) Earlier we warned against blithely assuming that any two events are independent. Here we can repeat this warning: Unless explicitly told (or you have a good reason), do not assume that two random variables are independent. The assumption of independence is a strong one. There are many scenarios where it is plausible. For example, the flips of two coins are probably independent. The rolls of two dice are probably independent. There are, however, also many scenarios where it is not plausible. Today’s changes in the share prices of Google and Apple are probably not independent. Today’s rainfall in Singapore and in Kuala Lumpur are probably not independent. Nonetheless, the assumption of independence is frequently — and incorrectly — made even when it is implausible. The reason is that the maths is easy if we assume independence — we can simply multiply probabilities together. Unfortunately, incorrectly assuming independence has sometimes had tragic consequences, as we saw in the Sally Clark case.

61 Random Variables: Expectation

Example 533. Let X be the outcome of a fair die roll. What is the expected value (or the mean) of X? In other words, on average, what’s the expected outcome of a fair die roll?

Note that X takes on a value 1 with probability 1/6. Similarly, it takes on a value 2 with probability 1/6. Etc. Hence, the expected value of X, denoted E[X], is given by:

\[ E[X] = \frac{1}{6}\cdot 1 + \frac{1}{6}\cdot 2 + \frac{1}{6}\cdot 3 + \frac{1}{6}\cdot 4 + \frac{1}{6}\cdot 5 + \frac{1}{6}\cdot 6 = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = \frac{21}{6} = 3.5. \]

E[X] is thus simply a weighted average of the possible values of X, where the weights are the probability weights.

We’ll use the following slightly-incorrect definition of a discrete random variable:⁶⁸

Slightly-Incorrect Definition. A random variable is discrete if its range is finite. That is, a random variable is discrete if it takes on finitely many possible values.

We can now formally define the expected value of a discrete random variable:

Definition 118. Let E = (S, Σ, P) be an experiment. Then the corresponding expectation operator, denoted E, is the function that maps any discrete random variable X ∶ S → R to a real number, according to the mapping rule

\[ E[X] = \sum_{k \in \mathrm{Range}(X)} P(X = k) \cdot k. \]

We call E[X] the expected value (or mean) of X. We often write µX = E[X] or even µ = E[X] (if it is clear from the context that we’re talking about the mean of X).

68 The correct definition is this: A random variable is discrete if its range is finite or countably-infinite. I avoid giving this correct definition because this would require explaining what “countably-infinite” means.

Example 534. Let X be the outcome of a fair die roll. The range of X is Range(X) = {1, 2, 3, 4, 5, 6}. So

\[
\begin{aligned}
E[X] &= \sum_{k \in \mathrm{Range}(X)} P(X = k) \cdot k \\
&= P(X = 1) \cdot 1 + P(X = 2) \cdot 2 + P(X = 3) \cdot 3 + P(X = 4) \cdot 4 + P(X = 5) \cdot 5 + P(X = 6) \cdot 6 \\
&= \frac{1}{6}\cdot 1 + \frac{1}{6}\cdot 2 + \frac{1}{6}\cdot 3 + \frac{1}{6}\cdot 4 + \frac{1}{6}\cdot 5 + \frac{1}{6}\cdot 6 = 3.5.
\end{aligned}
\]

Example 535. Let Y be the sum of two fair die-rolls. The range of Y is Range(Y) = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}. In Exercise 229, we worked out that P(Y = 2) = 1/36, P(Y = 3) = 2/36, etc. Thus:

\[
\begin{aligned}
E[Y] &= \sum_{k \in \mathrm{Range}(Y)} P(Y = k) \cdot k \\
&= P(Y = 2) \cdot 2 + P(Y = 3) \cdot 3 + P(Y = 4) \cdot 4 + P(Y = 5) \cdot 5 + \dots + P(Y = 12) \cdot 12 \\
&= \frac{1}{36}\cdot 2 + \frac{2}{36}\cdot 3 + \frac{3}{36}\cdot 4 + \frac{4}{36}\cdot 5 + \frac{5}{36}\cdot 6 + \frac{6}{36}\cdot 7 + \frac{5}{36}\cdot 8 + \frac{4}{36}\cdot 9 + \frac{3}{36}\cdot 10 + \frac{2}{36}\cdot 11 + \frac{1}{36}\cdot 12 \\
&= \frac{2 + 6 + 12 + 20 + 30 + 42 + 40 + 36 + 30 + 22 + 12}{36} = \frac{252}{36} = 7.
\end{aligned}
\]
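The mapping rule in Definition 118 is short enough to write as a one-line function. The sketch below recomputes the two expected values above using exact fractions; the helper names pmf_of and expectation are made up for illustration, and outcomes are assumed to be equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def pmf_of(rv, outcomes):
    """PMF of rv when all the listed outcomes are equally likely."""
    counts = Counter(rv(s) for s in outcomes)
    return {k: Fraction(c, len(outcomes)) for k, c in counts.items()}

def expectation(pmf):
    """E[X] = sum over k in Range(X) of P(X = k) * k."""
    return sum(p * k for k, p in pmf.items())

one_die = list(range(1, 7))
two_dice = list(product(one_die, repeat=2))

print(expectation(pmf_of(lambda s: s, one_die)))   # 7/2
print(expectation(pmf_of(sum, two_dice)))          # 7
```

The results 7/2 and 7 agree with the values 3.5 and 7 found by hand.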


Example 536. Flip two fair coins and roll two fair dice. Let X be the number of heads and Y be the number of sixes. Problem: What is E[X + Y]?

As it turns out, it is generally true that E[X + Y] = E[X] + E[Y] (as we’ll see in the next section). So if we knew this, then the problem would be very easy:

\[ E[X + Y] = E[X] + E[Y] = 1 + \frac{1}{3} = \frac{4}{3}. \]

But as an exercise, let’s pretend we don’t know that E[X + Y] = E[X] + E[Y]. We thus have to work out E[X + Y] the hard way:

First, note that Range(X + Y) = {0, 1, 2, 3, 4}. P(X + Y = 0) is the probability of 0 heads and 0 sixes. And P(X + Y = 1) is the probability of 1 head and 0 sixes OR 0 heads and 1 six. We can compute:

\[ P(X + Y = 0) = \frac{1}{2}\cdot\frac{1}{2}\cdot\frac{5}{6}\cdot\frac{5}{6} = \frac{25}{144}, \]

\[ P(X + Y = 1) = \binom{2}{1}\frac{1}{2}\cdot\frac{1}{2}\cdot\frac{5}{6}\cdot\frac{5}{6} + \frac{1}{2}\cdot\frac{1}{2}\cdot\binom{2}{1}\frac{5}{6}\cdot\frac{1}{6} = \frac{50}{144} + \frac{10}{144} = \frac{60}{144} = \frac{5}{12}. \]

You are asked to complete the rest of this problem in the exercise below.

Exercise 234. Complete the above example by following these steps: (a) Compute P(X + Y = 2). (b) Compute P(X + Y = 3). (c) Compute P(X + Y = 4). (d) Now compute E[X + Y]. (Answer on p. 1096.)

61.1 The Expected Value of a Constant R.V. is Constant

Example 537. Let 5 be a constant random variable on some experiment E = (S, Σ, P). That is, 5 ∶ S → R is the function defined by s ↦ 5. (Note that the symbol 5 does double duty by denoting both a function and a real number.) Then not surprisingly,

E[5] = 5,

where the 5 on the left-hand side is the function and the 5 on the right-hand side is the number.

That is, on average, we expect the random variable 5 to take on the value 5. We can easily prove the above observation:

Fact 71. If the constant random variable c maps every outcome to the number c, then E[c] = c.

Proof. The probability mass function (PMF) of the constant random variable c is given by P(c = c) = 1 and P(c = k) = 0 for any k ≠ c. Hence, E[c] = P(c = c) ⋅ c = 1 ⋅ c = c.


Exercise 235. In the game of 4D, you pay $1 to pick any four-digit number between 0000 and 9999 (there are thus 10, 000 possible choices). There are two variants of the 4D game — “big” and “small”. The prize structures are as given below. Let X be the prize received from a $1 stake in the “big” game and Y be the prize received from a $1 stake in the “small” game. (Answer on p. 1097.) (a) Write down the range of X and the range of Y . (b) Write down the probability distributions of X and Y . (c) Hence find E[X] and E[Y ]. (d) Which game — “big” or “small” — is expected to lose you less money?

(Source: Singapore Pools, “Rules for the 4-D Game”, Version 1.11, 17/11/15. PDF.)

61.2 The Expectation Operator is Linear

Definition 119. Let f ∶ A → B be a function, x, y ∈ A, and k ∈ R. We say that f is a linear transformation if it satisfies the following two conditions: (a) Additivity: f (x + y) = f (x) + f (y); and (b) Homogeneity of degree 1: f (kx) = kf (x).

Example 538. The summation operator ∑ is an example of a linear transformation, because it satisfies both additivity and homogeneity of degree 1:

\[ \sum_{i=1}^{n} (a_i + b_i) = \sum_{i=1}^{n} a_i + \sum_{i=1}^{n} b_i \quad\text{and}\quad \sum_{i=1}^{n} (k a_i) = k \sum_{i=1}^{n} a_i. \]

Example 539. The differentiation operator d/dx is an example of a linear transformation, because it satisfies both additivity and homogeneity of degree 1:

\[ \frac{d}{dx}\bigl(f(x) + g(x)\bigr) = \frac{d}{dx}f(x) + \frac{d}{dx}g(x) \quad\text{and}\quad \frac{d}{dx}\bigl(k f(x)\bigr) = k\,\frac{d}{dx}f(x). \]

A common mistake made by students is to believe that “everything is linear”. Here are two examples of operators that are not linear transformations.

Example 540. The square-root operator √⋅ is not a linear transformation. In general, we do not have

\[ \sqrt{x + y} = \sqrt{x} + \sqrt{y} \quad\text{or}\quad \sqrt{kx} = k\sqrt{x}. \]

Example 541. The square operator ⋅² is not a linear transformation. In general, we do not have

\[ (x + y)^2 = x^2 + y^2 \quad\text{or}\quad (kx)^2 = k x^2. \]

It turns out that the expectation operator is a linear transformation. Proposition 13. The expectation operator E is linear. That is, if X and Y are random variables and c is a constant, then (a) Additivity: E[X + Y ] = E [X] + E [Y ], (b) Homogeneity of degree 1: E[cX] = cE [X].

Proof. Optional, see p. 916 in the Appendices.

The linearity of the expectation operator is a powerful property, especially because it is true even if independence is not satisfied. Example 542. I stake $100 on each of two different 4D numbers for Saturday’s drawing (“big” game). (So that’s $200 total.) Let X and Y be my winnings (excluding my original stake) from the first and second numbers (respectively). Now, X and Y are certainly not independent because for example, if my first number wins first prize, then my second number cannot possibly also win first prize. Nonetheless, despite X and Y not being independent, the linearity of the expectation operator tells us that E [X + Y ] = E [X] + E [Y ] = $65.90 + $65.90 = $131.80.
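Linearity without independence can also be seen numerically on a smaller example than 4D. The sketch below uses the two-dice experiment, with X the sum and Y the product of the two dice; these two random variables are not independent (for instance, X = 2 rules out Y = 36). The function name E is made up for illustration, and it computes the expectation outcome by outcome, which gives the same number as summing P(X = k) ⋅ k over the range.

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))    # two fair dice, 36 equally likely outcomes

def E(rv):
    """Expected value of rv on the two-dice experiment."""
    return sum(Fraction(rv(s), len(S)) for s in S)

def X(s):
    """Sum of the two dice."""
    return s[0] + s[1]

def Y(s):
    """Product of the two dice."""
    return s[0] * s[1]

print(E(lambda s: X(s) + Y(s)), E(X) + E(Y))   # 77/4 77/4   (additivity)
print(E(lambda s: 3 * X(s)), 3 * E(X))         # 21 21       (homogeneity of degree 1)
```

Both properties of Proposition 13 hold here even though X and Y are dependent.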

62 Random Variables: Variance

Example 543. Consider a random variable X that is equally likely to take on one of 5 possible values: 0, 1, 2, 3, 4. Its mean is

\[ \mu_X = \sum_k P(X = k) \cdot k = \frac{1}{5}\cdot 0 + \frac{1}{5}\cdot 1 + \frac{1}{5}\cdot 2 + \frac{1}{5}\cdot 3 + \frac{1}{5}\cdot 4 = 2. \]

Now consider another random variable Y that is equally likely to take on one of 5 possible values: −8, −3, 2, 7, 12. Coincidentally, its mean is the same:

\[ \mu_Y = \sum_k P(Y = k) \cdot k = \frac{1}{5}\cdot(-8) + \frac{1}{5}\cdot(-3) + \frac{1}{5}\cdot 2 + \frac{1}{5}\cdot 7 + \frac{1}{5}\cdot 12 = 2. \]

The random variables X and Y share the same mean. However, there is an obvious difference: Y is “more spread out”. What, precisely, do we mean when we say that one random variable is “more spread out” than another? Our goal in this section is to invent a measure of “spread-outness”. We’ll call this the variance and denote the variance of any random variable X by V [X].

It’s not at all obvious how the variance should be defined. One possibility is to define the variance as the weighted average of the deviations from the mean.

Example 543 (continued from above). (Our first proposed definition of variance.) For X, the weighted average of the deviations from the mean is

\[
\begin{aligned}
V[X] &= \sum_k P(X = k) \cdot (k - \mu) \\
&= \frac{1}{5}(0 - \mu) + \frac{1}{5}(1 - \mu) + \frac{1}{5}(2 - \mu) + \frac{1}{5}(3 - \mu) + \frac{1}{5}(4 - \mu) \\
&= \frac{1}{5}(0 - 2) + \frac{1}{5}(1 - 2) + \frac{1}{5}(2 - 2) + \frac{1}{5}(3 - 2) + \frac{1}{5}(4 - 2) \\
&= -\frac{2}{5} - \frac{1}{5} + 0 + \frac{1}{5} + \frac{2}{5} = 0.
\end{aligned}
\]

Hmm. This works out to be 0. Is that just a weird coincidence? Let’s try the same for Y:

\[
\begin{aligned}
V[Y] &= \sum_k P(Y = k) \cdot (k - \mu) \\
&= \frac{1}{5}(-8 - \mu) + \frac{1}{5}(-3 - \mu) + \frac{1}{5}(2 - \mu) + \frac{1}{5}(7 - \mu) + \frac{1}{5}(12 - \mu) \\
&= \frac{1}{5}(-8 - 2) + \frac{1}{5}(-3 - 2) + \frac{1}{5}(2 - 2) + \frac{1}{5}(7 - 2) + \frac{1}{5}(12 - 2) \\
&= -2 - 1 + 0 + 1 + 2 = 0.
\end{aligned}
\]

Hmm. Again it works out to be 0.

This is no mere coincidence. It turns out that ∑ₖ P(X = k) ⋅ (k − µ) is always equal to 0. This is because

\[
\begin{aligned}
\sum_k P(X = k) \cdot (k - \mu) &= \underbrace{\sum_k P(X = k) \cdot k}_{=\mu} - \sum_k P(X = k) \cdot \mu \\
&= \mu - \mu \underbrace{\sum_k P(X = k)}_{=1} = 0.
\end{aligned}
\]

So our first proposed definition of the variance — the weighted average of the deviations from the mean — is always equal to 0. Intuitively, the reason is that the negative deviations (corresponding to those values below the mean) exactly cancel out the positive deviations (corresponding to those values above the mean). This proposed definition is thus quite useless. We cannot use it to say things like Y is “more spread out” than X. This suggests a second approach: define the variance to be the weighted average of the absolute deviations from the mean.

Example 543 (continued from above). (Our second proposed definition of variance.) For X, the weighted average of the absolute deviations from the mean is

\[
\begin{aligned}
V[X] &= \sum_k P(X = k) \cdot |k - \mu| \\
&= \frac{1}{5}|0 - \mu| + \frac{1}{5}|1 - \mu| + \frac{1}{5}|2 - \mu| + \frac{1}{5}|3 - \mu| + \frac{1}{5}|4 - \mu| \\
&= \frac{1}{5}|0 - 2| + \frac{1}{5}|1 - 2| + \frac{1}{5}|2 - 2| + \frac{1}{5}|3 - 2| + \frac{1}{5}|4 - 2| \\
&= \frac{2}{5} + \frac{1}{5} + 0 + \frac{1}{5} + \frac{2}{5} = \frac{6}{5}.
\end{aligned}
\]

And now let’s work out the same for Y:

\[
\begin{aligned}
V[Y] &= \sum_k P(Y = k) \cdot |k - \mu| \\
&= \frac{1}{5}|-8 - \mu| + \frac{1}{5}|-3 - \mu| + \frac{1}{5}|2 - \mu| + \frac{1}{5}|7 - \mu| + \frac{1}{5}|12 - \mu| \\
&= \frac{1}{5}|-8 - 2| + \frac{1}{5}|-3 - 2| + \frac{1}{5}|2 - 2| + \frac{1}{5}|7 - 2| + \frac{1}{5}|12 - 2| \\
&= 2 + 1 + 0 + 1 + 2 = 6.
\end{aligned}
\]

Wonderful! So we can now use this second proposed definition of the variance to say things like “Y is more spread out than X”. This second proposed definition seems perfectly satisfactory. Yet for some bizarre reason, we won’t use it! Instead, we’ll define the variance to be the weighted average of the squared deviations from the mean.

Example 543 (continued from above). (The actual definition of variance.) For X, the weighted average of the squared deviations from the mean is

\[
\begin{aligned}
V[X] &= \sum_k P(X = k) \cdot (k - \mu)^2 \\
&= \frac{1}{5}(0 - \mu)^2 + \frac{1}{5}(1 - \mu)^2 + \frac{1}{5}(2 - \mu)^2 + \frac{1}{5}(3 - \mu)^2 + \frac{1}{5}(4 - \mu)^2 \\
&= \frac{1}{5}(0 - 2)^2 + \frac{1}{5}(1 - 2)^2 + \frac{1}{5}(2 - 2)^2 + \frac{1}{5}(3 - 2)^2 + \frac{1}{5}(4 - 2)^2 \\
&= \frac{4}{5} + \frac{1}{5} + 0 + \frac{1}{5} + \frac{4}{5} = 2.
\end{aligned}
\]

And now let’s work out the same for Y:

\[
\begin{aligned}
V[Y] &= \sum_k P(Y = k) \cdot (k - \mu)^2 \\
&= \frac{1}{5}(-8 - \mu)^2 + \frac{1}{5}(-3 - \mu)^2 + \frac{1}{5}(2 - \mu)^2 + \frac{1}{5}(7 - \mu)^2 + \frac{1}{5}(12 - \mu)^2 \\
&= \frac{1}{5}(-8 - 2)^2 + \frac{1}{5}(-3 - 2)^2 + \frac{1}{5}(2 - 2)^2 + \frac{1}{5}(7 - 2)^2 + \frac{1}{5}(12 - 2)^2 \\
&= 20 + 5 + 0 + 5 + 20 = 50.
\end{aligned}
\]
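The three candidate measures of spread discussed in this example can be computed side by side. This is a small sketch that assumes, as in the example, a random variable equally likely to take each of the listed values; the function name spread_measures is made up for illustration.

```python
from fractions import Fraction

def spread_measures(values):
    """Mean and three candidate measures of spread for equally likely values."""
    p = Fraction(1, len(values))
    mu = sum(p * v for v in values)
    mean_dev = sum(p * (v - mu) for v in values)           # first proposal: always 0
    mean_abs_dev = sum(p * abs(v - mu) for v in values)    # second proposal
    variance = sum(p * (v - mu) ** 2 for v in values)      # actual definition
    return mu, mean_dev, mean_abs_dev, variance

print(spread_measures([0, 1, 2, 3, 4]))       # mean 2, then 0, 6/5, 2
print(spread_measures([-8, -3, 2, 7, 12]))    # mean 2, then 0, 6, 50
```

The first measure is 0 for both random variables, while the second and third both report that Y is more spread out than X.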

Formally,

Definition 120. Let µ = E[X]. Then the variance operator is denoted V and is the function that maps each random variable X to a real number, given by the mapping rule

\[ V[X] = E\left[(X - \mu)^2\right]. \]

We call V[X] the variance of X. This is often also instead written as \(\sigma_X^2\) or even more simply as \(\sigma^2\) (if it is clear from the context that we’re talking about the variance of X).

So to calculate the variance, we do this: Consider all the possible values that X can take. Take the difference between these values and the mean of X. Square them. Then take the probability-weighted average of these squared numbers. More examples:

Example 544. Let the random variable X be the outcome of the roll of a fair die. We already know that µ = 3.5. Hence,

\[
\begin{aligned}
V[X] &= E\left[(X - \mu)^2\right] = E\left[(X - 3.5)^2\right] \\
&= P(X = 1) \cdot (1 - 3.5)^2 + P(X = 2) \cdot (2 - 3.5)^2 + \dots + P(X = 6) \cdot (6 - 3.5)^2 \\
&= \frac{1}{6}\left(2.5^2 + 1.5^2 + 0.5^2 + 0.5^2 + 1.5^2 + 2.5^2\right) = \frac{35}{12} \approx 2.92.
\end{aligned}
\]

So the variance of the die roll is 35/12 ≈ 2.92. This means that the expected squared deviation of X from its mean µ = 3.5 is 35/12 ≈ 2.92.

Example 545. Roll two fair dice. Let the random variable Y be the sum of the two dice. We already know from Example 535 that µ = 7. So, using also our findings from Exercise 229,

\[
\begin{aligned}
V[Y] &= E\left[(Y - \mu)^2\right] = E\left[(Y - 7)^2\right] \\
&= P(Y = 2) \cdot (2 - 7)^2 + P(Y = 3) \cdot (3 - 7)^2 + \dots + P(Y = 12) \cdot (12 - 7)^2 \\
&= \frac{1}{36}\cdot 5^2 + \frac{2}{36}\cdot 4^2 + \frac{3}{36}\cdot 3^2 + \frac{4}{36}\cdot 2^2 + \frac{5}{36}\cdot 1^2 + \frac{6}{36}\cdot 0^2 + \frac{5}{36}\cdot 1^2 + \frac{4}{36}\cdot 2^2 + \frac{3}{36}\cdot 3^2 + \frac{2}{36}\cdot 4^2 + \frac{1}{36}\cdot 5^2 \\
&= \frac{2\,(25 + 32 + 27 + 16 + 5)}{36} = \frac{210}{36} = \frac{70}{12} \approx 5.83.
\end{aligned}
\]

So the variance of the sum of two dice is 70/12 ≈ 5.83. This means that on average, the square of the deviation of Y from its mean µ = 7 is 70/12 ≈ 5.83.

As the above examples suggest, calculating the variance can be tedious. Fortunately, there is a shortcut:

Fact 72. Let X be a random variable with mean µ. Then V[X] = E[X²] − µ².

Proof. Using the definition of variance, the linearity of the expectation operator (Proposition 13), and the fact that µ is a constant, we have

\[
\begin{aligned}
V[X] &= E\left[(X - \mu)^2\right] = E\left[X^2 + \mu^2 - 2X\mu\right] = E\left[X^2\right] + E\left[\mu^2\right] - 2E[X\mu] \\
&= E\left[X^2\right] + \mu^2 - 2\mu E[X] = E\left[X^2\right] + \mu^2 - 2\mu \cdot \mu = E\left[X^2\right] - \mu^2.
\end{aligned}
\]

We now redo the previous two examples using this shortcut:

Example 544 (continued from above). Let the random variable X be the outcome of the roll of a fair die. We already know that µ = 3.5. So compute

\[ E\left[X^2\right] = P(X = 1) \cdot 1^2 + P(X = 2) \cdot 2^2 + \dots + P(X = 6) \cdot 6^2 = \frac{1}{6}\left(1^2 + 2^2 + \dots + 6^2\right) = \frac{91}{6}. \]

Hence,

\[ V[X] = E\left[X^2\right] - \mu^2 = \frac{91}{6} - 3.5^2 = \frac{182}{12} - \frac{147}{12} = \frac{35}{12}. \]

Example 545 (continued from above). Let the random variable Y be the sum of two rolled dice. We already know from Example 535 that µ = 7. So, using also our findings from Exercise 229,

\[
\begin{aligned}
E\left[Y^2\right] &= P(Y = 2) \cdot 2^2 + P(Y = 3) \cdot 3^2 + \dots + P(Y = 12) \cdot 12^2 \\
&= \frac{1}{36}\cdot 2^2 + \frac{2}{36}\cdot 3^2 + \frac{3}{36}\cdot 4^2 + \dots + \frac{1}{36}\cdot 12^2 \\
&= \frac{4 + 18 + 48 + 100 + 180 + 294 + 320 + 324 + 300 + 242 + 144}{36} = \frac{1974}{36} = \frac{658}{12}.
\end{aligned}
\]

Hence,

\[ V[Y] = E\left[Y^2\right] - \mu^2 = \frac{658}{12} - 7^2 = \frac{658}{12} - \frac{588}{12} = \frac{70}{12}. \]

This is still tedious, but arguably quicker than before.
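The two routes to the variance can also be compared by machine. The following is a minimal Python sketch (not part of the original text; the helper names `pmf_of_dice_sum`, `variance_by_definition` and `variance_by_shortcut` are made up for illustration). It uses exact fractions to redo Examples 544 and 545 both ways.

```python
from fractions import Fraction
from itertools import product


def pmf_of_dice_sum(num_dice):
    """Return {value: probability} for the sum of num_dice fair dice."""
    pmf = {}
    for rolls in product(range(1, 7), repeat=num_dice):
        total = sum(rolls)
        pmf[total] = pmf.get(total, Fraction(0)) + Fraction(1, 6 ** num_dice)
    return pmf


def mean(pmf):
    return sum(p * x for x, p in pmf.items())


def variance_by_definition(pmf):
    # V[X] = E[(X - mu)^2]
    mu = mean(pmf)
    return sum(p * (x - mu) ** 2 for x, p in pmf.items())


def variance_by_shortcut(pmf):
    # V[X] = E[X^2] - mu^2 (Fact 72)
    mu = mean(pmf)
    return sum(p * x ** 2 for x, p in pmf.items()) - mu ** 2


one_die = pmf_of_dice_sum(1)
two_dice = pmf_of_dice_sum(2)
print(variance_by_definition(one_die), variance_by_shortcut(one_die))
print(variance_by_definition(two_dice), variance_by_shortcut(two_dice))
```

Python prints fractions in lowest terms, so the variance of the two-dice sum appears as 35/6, which is the same number as 70/12.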

Exercise 236. Let the random variable Z be the sum of three rolled dice. Find V[Z]. (Answer on p. 1098.)


62.1 The Variance of a Constant R.V. is 0

A constant random variable cannot vary. So not surprisingly, the variance of a constant random variable is 0.

Fact 73. Let c be a constant random variable (i.e. it maps every outcome to the real number c). Then V[c] = 0.

Proof. Use Fact 72: $V[c] = E[c^2] - (E[c])^2 = c^2 - c^2 = 0$.


62.2 Standard Deviation

Let X be a random variable. Then E[X] has the same unit of measure as X. In contrast, V[X] uses the squared unit.

Example 546. There are 100 dumbbells in a gym, of which 30 have weight 5 kg and the remaining 70 have weight 10 kg. Let X be the weight of a randomly-chosen dumbbell. Then the mean of X is

$$E[X] = \mu = 0.3 \times 5 \text{ kg} + 0.7 \times 10 \text{ kg} = 8.5 \text{ kg}.$$

And the variance of X is

$$V[X] = 0.3 \times (5 \text{ kg} - 8.5 \text{ kg})^2 + 0.7 \times (10 \text{ kg} - 8.5 \text{ kg})^2 = 0.3 \times 12.25 \text{ kg}^2 + 0.7 \times 2.25 \text{ kg}^2 = 5.25 \text{ kg}^2.$$

To get a measure of “spread” that uses the original unit of measure, we simply take the square root of the variance. This is called the standard deviation.

Definition 121. Let X be a random variable and V[X] be its variance. Then the standard deviation of X is defined as

$$SD[X] = \sqrt{V[X]}.$$

The variance of a random variable X is often denoted $\sigma_X^2$ or even more simply as $\sigma^2$ (if it is clear from the context that we’re talking about the variance of X).

Correspondingly, the standard deviation of X is often denoted $\sigma_X$ or $\sigma$.

Example 546 (continued from above). We calculated the variance of X to be $V[X] = \sigma^2 = 5.25 \text{ kg}^2$. Hence, the standard deviation of X is simply $\sigma = \sqrt{5.25 \text{ kg}^2} \approx 2.29 \text{ kg}$.

Exercise 237. There are 100 rulers in a bookstore, of which 35 have length 20 cm and the remaining 65 have length 30 cm. Let Y be the length of a randomly-chosen ruler. Find the mean, variance, and standard deviation of Y. (Be sure to include the units of measurement.) (Answer on p. 1098.)
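The dumbbell numbers in Example 546 can be reproduced in a few lines. This is only an illustrative Python sketch (the variable names are our own and the units are tracked only in the names and printed labels).

```python
import math

# Example 546: weights in kg mapped to their probabilities.
weights_kg = {5: 0.3, 10: 0.7}

mean_kg = sum(p * w for w, p in weights_kg.items())
variance_kg2 = sum(p * (w - mean_kg) ** 2 for w, p in weights_kg.items())
sd_kg = math.sqrt(variance_kg2)  # back in the original unit (kg)

print(f"mean = {mean_kg:.2f} kg")
print(f"variance = {variance_kg2:.2f} kg^2")
print(f"standard deviation = {sd_kg:.2f} kg")
```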


62.3 The Variance Operator is Not Linear

The variance operator is not linear. However, given independence, the variance operator does satisfy additivity and homogeneity of degree 2.

Proposition 14. Let X and Y be independent random variables and c be a constant. Then
(a) Additivity: $V[X + Y] = V[X] + V[Y]$,
(b) Homogeneity of degree 2: $V[cX] = c^2 V[X]$.

Proof. Optional, see p. 917 in the Appendices.

With the above, it becomes much easier than before to find the variance of the sum of 2 dice, 3 dice, or indeed n dice.


Example 547. Let X be the outcome of a fair die-roll. We showed earlier that $V[X] = \frac{35}{12}$.

Now roll two fair dice. Let $X_1$ and $X_2$ be the respective outcomes. Let Y be the sum of the two dice (i.e. $Y = X_1 + X_2$). Assuming independence, we have

$$V[Y] = V[X_1 + X_2] = V[X_1] + V[X_2] = \frac{70}{12}.$$

Compare this quick computation to the work we did in Example 545!

Now roll three fair dice. Let $X_3$, $X_4$, and $X_5$ be the respective outcomes. Let Z be the sum of the three dice (i.e. $Z = X_3 + X_4 + X_5$). Again, assuming independence, we have

$$V[Z] = V[X_3 + X_4 + X_5] = V[X_3] + V[X_4] + V[X_5] = \frac{105}{12}.$$

Again, compare this quick computation to the work you had to do in Exercise 236!

Now, let A be double the outcome of a die roll (i.e. A = 2X). Note importantly that A ≠ Y. Y is the sum of two independent die rolls. In contrast, A is double the outcome of a single die roll. Indeed, by Proposition 14, we see that

$$V[A] = V[2X] = 4V[X] = \frac{140}{12} \ne V[Y].$$

Similarly, let B be triple the outcome of a die roll (i.e. B = 3X). Note importantly that B ≠ Z. Z is the sum of three independent die rolls. In contrast, B is triple the outcome of a single die roll. Indeed, by Proposition 14, we see that

$$V[B] = V[3X] = 9V[X] = \frac{315}{12} \ne V[Z].$$
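The difference between "the sum of two independent dice" and "double one die" can be checked by brute-force enumeration. The Python sketch below is an illustration only (the helper `variance` is our own name). It lists every equally likely outcome and computes the variances exactly.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)


def variance(values):
    """Variance of a random variable that is equally likely to take each listed value."""
    n = len(values)
    mu = Fraction(sum(values), n)
    return sum((v - mu) ** 2 for v in values) / n


var_x = variance(list(faces))                                   # X: one die
var_y = variance([a + b for a, b in product(faces, repeat=2)])  # Y: sum of two independent dice
var_a = variance([2 * a for a in faces])                        # A: double a single die

print(var_x, var_y, var_a)
```

The three printed fractions are in lowest terms; they equal 35/12, 70/12 and 140/12 respectively, in line with Proposition 14.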

Exercise 238. The weight of a fish in a pond is a random variable with mean µ kg and variance $\sigma^2 \text{ kg}^2$. (Include the units of measurement in your answers.) (Answer on p. 1098.)
(a) If two fish are caught and the weights of these fish are independent of each other, what are the mean and variance of the total weight of the two fish?
(b) If one fish is caught and an exact clone is made of it, what are the mean and variance of the total weight of the fish and its clone?
(c) If two fish are caught and the weights of these fish are not independent of each other, what are the mean and variance of the total weight of the two fish?


62.4 The Definition of the Variance (Optional)

Why is the variance defined as the weighted average of squared deviations from the mean?

1. First, we tried defining the variance as the weighted average of deviations from the mean, i.e. $V[X] = E[X - \mu]$. But this was no good, because this quantity would always be equal to 0.⁶⁹
2. Next, we tried defining the variance as the weighted average of absolute deviations from the mean, i.e. $V[X] = E[|X - \mu|]$. This seemed to work well enough. Yet for some bizarre reason, we choose not to use this definition.
3. Instead, we choose to use this definition: $V[X] = E\left[(X - \mu)^2\right]$.

Why do we prefer using squared (rather than absolute) deviations as our definition of variance? The conventional view is that the squared deviations definition is superior to the absolute deviations definition (but see Gorard (2005) and Taleb (2014) for dissenting views). Here are some reasons for believing the squared deviations definition to be superior:

• The maths works out more nicely. For example:
– The algebra is easier when dealing with squares than with absolute values.
– Differentiation is easier (observe that $x^2$ is differentiable everywhere but $|x|$ is not differentiable at $x = 0$).
– Variances are additive: If X and Y are independent, then $V[X + Y] = V[X] + V[Y]$. In contrast, if we use the definition $V[X] = E[|X - \mu|]$, then variances are no longer additive.

• Tradition (inertia).

– A century or two ago, some Europeans preferred using squared to absolute deviations. And so we’re stuck with using this. See also these Stack Exchange Q&A discussions: [1], [2], [3], [4], and [5].

69 This is easily proven: $E[X - \mu] = E[X] - E[\mu] = \mu - \mu = 0$.


63 The Coin-Flips Problem (Fun, Optional)

Here’s another example of a probability problem that can be stated very simply, yet has counter-intuitive results.

Example 548. Keep flipping a fair coin until you get a sequence of HH (two heads in a row). Let X be the number of flips taken. Now, keep flipping a fair coin until you get a sequence of HT. Let Y be the number of flips taken. Which is larger: $\mu_X = E[X]$ or $\mu_Y = E[Y]$?

Intuition might suggest that “obviously”, $\mu_X = \mu_Y$. Intuition would be wrong. It turns out that, surprisingly enough, $\mu_X = 6$ and $\mu_Y = 4$!

Example 549. Now suppose we flip a fair coin 10,001 times. This gives us a sequence of 10,000 pairs of consecutive coin-flips. For example, if the 10,001 coin-flips are HHTHT . . . , then the first four pairs of consecutive coin-flips are HH, HT, TH, and HT. Let A be the proportion of the 10,000 pairs of consecutive coin-flips that are HH. Let B be the proportion of the 10,000 pairs of consecutive coin-flips that are HT. Which is larger: $\mu_A = E[A]$ or $\mu_B = E[B]$?

In the previous example, we saw that it took, on average, 6 flips before getting HH and 4 flips before getting HT. So “obviously”, we’d expect a smaller proportion to be HH’s. That is, $\mu_A < \mu_B$. Sadly, we would again be wrong! It turns out that $\mu_A = \mu_B = 1/4$!

This Google spreadsheet simulates 10,001 coin-flips and calculates A and B.

If you’re interested, the results given in the above two examples are formally proven in Fact 103 in the Appendices.
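Both claims are easy to explore by simulation. The following Python sketch is an illustration only and is not the spreadsheet mentioned above; the function name `flips_until`, the seed and the number of trials are arbitrary choices of ours. It estimates the two waiting times of Example 548 and the two proportions of Example 549.

```python
import random


def flips_until(pattern, rng):
    """Flip a fair coin until the last two flips equal pattern; return the number of flips."""
    history = ""
    while not history.endswith(pattern):
        history += rng.choice("HT")
    return len(history)


rng = random.Random(0)  # fixed seed so that the run is repeatable
trials = 20_000

# Example 548: average number of flips needed to see HH and to see HT.
mean_hh = sum(flips_until("HH", rng) for _ in range(trials)) / trials
mean_ht = sum(flips_until("HT", rng) for _ in range(trials)) / trials
print(round(mean_hh, 2), round(mean_ht, 2))

# Example 549: proportions of HH and HT among the 10,000 consecutive pairs.
flips = "".join(rng.choice("HT") for _ in range(10_001))
pairs = [flips[i:i + 2] for i in range(10_000)]
prop_hh = pairs.count("HH") / len(pairs)
prop_ht = pairs.count("HT") / len(pairs)
print(prop_hh, prop_ht)
```

Being a simulation, this gives estimates rather than exact values: the two averages should land near 6 and 4, and the two proportions near 1/4.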


64 The Bernoulli Trial and the Bernoulli Distribution

A Bernoulli trial is an experiment (S, Σ, P) with exactly two possible outcomes. A coin flip is an example of a Bernoulli trial.

Example 550. Flip a coin. We can model this with a Bernoulli trial with probability of success (heads) 0.5:
• Sample space S = {T, H},
• Event space Σ = {∅, {T}, {H}, S},
• Probability function P({T}) = 0.5 and P({H}) = 0.5.

The corresponding Bernoulli random variable is simply the random variable X ∶ S → R defined by X({T}) = 0 and X({H}) = 1. Its probability distribution is given by P(X = 0) = 0.5 and P(X = 1) = 0.5.

Formally:

Definition 122. A Bernoulli trial with probability of success p is an experiment (S, Σ, P) where
• S = {0, 1}. (The sample space contains 2 elements.)
• Σ = {∅, {0}, {1}, S}.
• P ∶ Σ → R is defined by P({0}) = 1 − p and P({1}) = p. (And as usual P(∅) = 0 and P(S) = 1.)

The corresponding Bernoulli random variable is simply the random variable X ∶ S → R defined by X({0}) = 0 and X({1}) = 1. Its probability distribution is given by P(X = 0) = 1 − p and P(X = 1) = p.

Note that we can denote the two elements of the sample space with any symbols. We could use 0 (standing for failure) and 1 (standing for success). Or we could use T and H, as was done in the example above.


Example 551. On any given day, our refrigerator at home has probability 0.001 of breaking down. We can model this with a Bernoulli trial with probability of success 0.001 (here a “success” is a breakdown):
• Sample space S = {0, 1},
• Event space Σ = {∅, {0}, {1}, S},
• Probability function P({0}) = 0.999 and P({1}) = 0.001.

The corresponding Bernoulli random variable is simply the random variable T ∶ S → R defined by T({0}) = 0 and T({1}) = 1.

Its probability distribution is given by P(T = 0) = 0.999 and P(T = 1) = 0.001. In words, the probability of no breakdown is 0.999 and the probability of a breakdown is 0.001.

Example 552. 90% of H2 Maths students pass their H2 Maths A-level exams. We randomly pick an H2 Maths student and see if she passes her H2 Maths A-level exam. We can model this with a Bernoulli trial with probability of success 0.9:
• Sample space S = {F, P},
• Event space Σ = {∅, {F}, {P}, S},
• Probability function P({F}) = 0.1 and P({P}) = 0.9.

The corresponding Bernoulli random variable is simply the random variable Y ∶ S → R defined by Y({F}) = 0 and Y({P}) = 1. Its probability distribution is given by P(Y = 0) = 0.1 and P(Y = 1) = 0.9.

The following two statements are equivalent:
1. T is a Bernoulli random variable with probability of success p.
2. The random variable T has the Bernoulli distribution with probability of success p.


64.1 Mean and Variance of the Bernoulli Random Variable

Fact 74. A Bernoulli random variable T with probability of success p has mean p and variance p(1 − p).

Proof. $E[T] = P(T = 0) \cdot 0 + P(T = 1) \cdot 1 = (1 - p) \cdot 0 + p \cdot 1 = p$.

For the variance, first compute

$$E[T^2] = P(T = 0) \cdot 0^2 + P(T = 1) \cdot 1^2 = (1 - p) \cdot 0 + p \cdot 1^2 = p.$$

Hence, $V[T] = E[T^2] - (E[T])^2 = p - p^2 = p(1 - p)$.
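Fact 74 can be spot-checked for particular values of p. The Python sketch below is an illustration only (the function name is our own). It computes the mean and variance straight from the two-point distribution, using exact fractions, for the probabilities that appear in Examples 550 to 552.

```python
from fractions import Fraction


def bernoulli_mean_and_variance(p):
    """Compute E[T] and V[T] directly from P(T = 0) = 1 - p and P(T = 1) = p."""
    pmf = {0: 1 - p, 1: p}
    mean = sum(prob * t for t, prob in pmf.items())
    second_moment = sum(prob * t ** 2 for t, prob in pmf.items())
    return mean, second_moment - mean ** 2  # shortcut formula of Fact 72


for p in (Fraction(1, 2), Fraction(1, 1000), Fraction(9, 10)):
    mean, var = bernoulli_mean_and_variance(p)
    # The last printed entry checks whether the variance equals p(1 - p).
    print(p, mean, var, var == p * (1 - p))
```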


65 The Binomial Distribution

Informally, the binomial random variable simply counts the number of successes in a sequence of n identical, but independent Bernoulli trials.

Example 553. Flip 3 fair coins. Let X be the number of heads.

X is an example of a binomial random variable with parameters 3 and $\frac{1}{2}$.

X can take on values 0, 1, 2, or 3 (corresponding to the number of heads). The probability distribution of X is given by:

$$P(X = 0) = \binom{3}{0}\left(\frac{1}{2}\right)^0\left(\frac{1}{2}\right)^3 = \frac{1}{8}, \qquad P(X = 1) = \binom{3}{1}\left(\frac{1}{2}\right)^1\left(\frac{1}{2}\right)^2 = \frac{3}{8},$$

$$P(X = 2) = \binom{3}{2}\left(\frac{1}{2}\right)^2\left(\frac{1}{2}\right)^1 = \frac{3}{8}, \qquad P(X = 3) = \binom{3}{3}\left(\frac{1}{2}\right)^3\left(\frac{1}{2}\right)^0 = \frac{1}{8}.$$

Formally:

Definition 123. Let $T_1, T_2, \dots, T_n$ be n identical, but independent Bernoulli random variables, each with probability of success p. Then the binomial random variable X with parameters n and p is defined as:

$$X = T_1 + T_2 + \dots + T_n.$$

The following three statements are entirely equivalent:
1. X is a binomial random variable with parameters n and p.
2. The random variable X has the binomial distribution with parameters n and p.
3. X ∼ B(n, p).


Example 554. 90% of H2 Maths students pass their A-level exams. Let Y be the number of passes among two randomly-chosen students. Then Y is a binomial random variable with parameters 2 and 0.9. Its probability distribution is given by:

$$P(Y = 0) = \binom{2}{0} 0.9^0 \, 0.1^2 = 0.01,$$

$$P(Y = 1) = \binom{2}{1} 0.9^1 \, 0.1^1 = 0.18,$$

$$P(Y = 2) = \binom{2}{2} 0.9^2 \, 0.1^0 = 0.81.$$

In words, the probability that both fail is 0.01, the probability that exactly one passes is 0.18, and the probability that both pass is 0.81.


65.1 Probability Distribution of the Binomial R.V.

Let X ∼ B(n, p). What is P(X = k)?

Observe that P(X = k) is simply the probability that in a sequence of n independent Bernoulli trials, each with probability of success p, there are exactly k successes.

First consider instead the probability that in a sequence of n trials, the first k trials are successes and the remaining n − k are failures. We know that the probability of a success is p and the probability of a failure is 1 − p. Hence, by the Multiplication Principle, this probability is simply $p^k (1 - p)^{n-k}$.

The above is the probability of k successes and n − k failures, but where exactly the first k trials are successes and exactly the last n − k trials are failures. But we don’t care about where the successes are. We only care that there are k successes. And there are C(n, k) ways to have exactly k successes in n trials, each of which has this same probability. Thus,

$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}.$$

In summary:

Fact 75. Let X ∼ B(n, p). Then for any k = 0, 1, . . . , n,

$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}.$$

Example 555. Let X be the number of heads when 10 fair coins are flipped. Then X ∼ B(10, 0.5). And the probability that exactly 8 coins are heads is:

$$P(X = 8) = \binom{10}{8} 0.5^8 \, 0.5^2 = \frac{45}{1024}.$$

Example 556. 90% of H2 Maths students pass their A-level exams. Let Y be the number of passes among 20 randomly-chosen students. Then Y ∼ B(20, 0.9). And the probability that at least 18 pass is

$$P(Y \ge 18) = P(Y = 18) + P(Y = 19) + P(Y = 20) = \binom{20}{18} 0.9^{18} \, 0.1^2 + \binom{20}{19} 0.9^{19} \, 0.1^1 + \binom{20}{20} 0.9^{20} \, 0.1^0 \approx 0.677.$$
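The formula in Fact 75 translates directly into code. Here is a minimal Python sketch (an illustration only; `binomial_pmf` is our own helper name, built on the standard library's `math.comb`) that recomputes Examples 555 and 556.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p), as in Fact 75."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Example 555: exactly 8 heads in 10 fair coin flips.
p_eight_heads = binomial_pmf(8, 10, 0.5)
print(p_eight_heads, 45 / 1024)

# Example 556: at least 18 passes among 20 students.
p_at_least_18 = sum(binomial_pmf(k, 20, 0.9) for k in range(18, 21))
print(round(p_at_least_18, 3))
```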


65.2 The Mean and Variance of the Binomial Random Variable

Example 557. Problem: Three machines each have, independently, probability 0.3 of failure. What is the expected number of failures? What is the variance of the number of failures?

Solution: Let Z ∼ B(3, 0.3) be the number of failures. Then

$$P(Z = 1) = \binom{3}{1} 0.3^1 \, 0.7^2, \qquad P(Z = 2) = \binom{3}{2} 0.3^2 \, 0.7^1, \qquad P(Z = 3) = \binom{3}{3} 0.3^3 \, 0.7^0.$$

Hence,

$$E[Z] = P(Z = 1) \cdot 1 + P(Z = 2) \cdot 2 + P(Z = 3) \cdot 3$$

$$= \binom{3}{1} 0.3^1 \, 0.7^2 \cdot 1 + \binom{3}{2} 0.3^2 \, 0.7^1 \cdot 2 + \binom{3}{3} 0.3^3 \, 0.7^0 \cdot 3 = 0.441 + 0.378 + 0.081 = 0.9.$$

That is, the expected number of failures is 0.9.

Now,

$$E[Z^2] = P(Z = 1) \cdot 1^2 + P(Z = 2) \cdot 2^2 + P(Z = 3) \cdot 3^2$$

$$= \binom{3}{1} 0.3^1 \, 0.7^2 \cdot 1^2 + \binom{3}{2} 0.3^2 \, 0.7^1 \cdot 2^2 + \binom{3}{3} 0.3^3 \, 0.7^0 \cdot 3^2 = 0.441 + 0.756 + 0.243 = 1.44.$$

Hence, $V[Z] = E[Z^2] - (E[Z])^2 = 1.44 - 0.9^2 = 0.63$.

That is, the variance of the number of failures is 0.63.

It turns out though that there is a much quicker formula for finding the mean and variance of any binomial random variable.


Fact 76. If X ∼ B(n, p), then E[X] = np and V[X] = np(1 − p).

(You can verify that these formulae work for the last example: n = 3, p = 0.3, and thus E[Z] = np = 0.9 and V[Z] = np(1 − p) = 3 × 0.3 × 0.7 = 0.63.)
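The same check can be automated for other parameters. The Python sketch below is an illustration only (the function name and the choice of test cases are ours). It computes the mean and variance by summing over the whole distribution and prints them next to np and np(1 − p).

```python
from math import comb


def binomial_mean_and_variance(n, p):
    """Compute E[X] and V[X] for X ~ B(n, p) by summing over the distribution."""
    pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * prob for k, prob in enumerate(pmf))
    second_moment = sum(k ** 2 * prob for k, prob in enumerate(pmf))
    return mean, second_moment - mean ** 2


for n, p in [(3, 0.3), (10, 0.5), (20, 0.9)]:
    mean, var = binomial_mean_and_variance(n, p)
    # Columns: n, p, summed mean, summed variance, np, np(1 - p).
    print(n, p, round(mean, 6), round(var, 6), round(n * p, 6), round(n * p * (1 - p), 6))
```

Agreement on a few cases is of course not a proof.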

Proof. Let $T_1, T_2, \dots, T_n$ be identical, but independent Bernoulli random variables with parameter p. Then $X = T_1 + T_2 + \dots + T_n$. Hence, by the linearity of the expectation operator and Fact 74,

$$E[X] = E[T_1 + T_2 + \dots + T_n] = E[T_1] + E[T_2] + \dots + E[T_n] = p + p + \dots + p = np.$$

And since the $T_i$ are independent, by the additivity of the variance (Proposition 14) and Fact 74,

$$V[X] = V[T_1 + T_2 + \dots + T_n] = V[T_1] + V[T_2] + \dots + V[T_n] = p(1 - p) + p(1 - p) + \dots + p(1 - p) = np(1 - p).$$

Exercise 239. (Answer on p. 1099.) Plane engine #1 contains 20 components, each of which has probability 0.01 of failure. Plane engine #2 contains 35 components, each of which has probability 0.005 of failure. The probability that any component fails is independent of whether any other component has failed. An engine fails if and only if at least 2 of its components fail. What is the probability that both engines fail?


66 The Poisson Distribution

SYLLABUS ALERT
The Poisson distribution is in the 9740 (old) syllabus, but not in the 9758 (revised) syllabus. So you can skip this chapter if you’re taking 9758.

The Poisson process is the continuous-time analogue of the Bernoulli process.⁷⁰ And in parallel, the Poisson random variable is the limit of the binomial random variable.

Example 558. The long-term average number of murders per year in Singapore is 2.4. How might we model the rate at which murders are committed in Singapore?

Let’s assume that the rate at which murders are committed satisfies two properties:
1. (Time-homogeneity.) The probability that there are k murders in a time interval of a given length is the same, regardless of when that interval starts. For example, the probability that there are 2 murders in the first 90 days of the year is the same as the probability that there are 2 murders in the last 90 days of the year. As another example, the probability that there is 1 murder on January 10th is the same as the probability that there is 1 murder on August 5th.
2. (Independence.) The probability that there is a murder at any given moment does not depend on the number of murders that have already been committed that year. For example, the probability that there is a murder in December does not depend on how many murders were committed between January and November.

Then an appropriate model might be the Bernoulli process. Let us say that each month, there is a murder with probability 2.4/12 = 0.2, and no murder with probability 0.8. The number of murders each month may thus be modelled by a Bernoulli random variable T with parameter 0.2. By assumption, the number of murders in one month has no influence on the number of murders in another month. Thus, the number of murders in a given year can be modelled by the binomial random variable X ∼ B(12, 0.2). Equivalently,

$$X = T_{\text{Jan}} + T_{\text{Feb}} + \dots + T_{\text{Dec}}.$$

(... Example continued on the next page ...)

70 The Poisson process is an infinite process that is beyond the scope of the A-levels, and is thus omitted from this book.


(... Example continued from the previous page ...)

(Notice the number 0.2 was chosen so that E[X] = np = 12 × 0.2 = 2.4 matches the long-term average number of murders per year.)

This model is reasonably good, but suffers from at least two flaws: It implicitly assumes that
• In any given month, there can be at most 1 murder; and
• In any given year, there can be at most 12 murders.

These two implicit assumptions are somewhat unrealistic.

In the above model, what we did was to partition the year into 12 time intervals. If we instead partitioned the year into 365 time intervals, the above two implicit assumptions would be relaxed. Let’s say that each day, there is probability 2.4/365 ≈ 0.00658 of a murder and probability 1 − 2.4/365 ≈ 0.99342 of no murders. The number of murders each day may thus be modelled by a Bernoulli random variable U with parameter 2.4/365.

Thus, the number of murders in a given year can be modelled by the binomial random variable $Y \sim B\left(365, \frac{2.4}{365}\right)$. Equivalently,

$$Y = U_1 + U_2 + \dots + U_{365}.$$

(Again, the number $\frac{2.4}{365}$ was deliberately chosen so that $E[Y] = np = 365 \times \frac{2.4}{365} = 2.4$ matches the long-term average number of murders per year.)

This second model $Y \sim B\left(365, \frac{2.4}{365}\right)$ is probably better than the first model X ∼ B(12, 0.2). But why stop at partitioning the year into 365 days?

In general, we can model the number of murders by the binomial random variable $Z \sim B\left(n, \frac{2.4}{n}\right)$. Taking the above reasoning to the extreme, we can instead partition the year into infinitely-many infinitely-short time intervals. That is, we can let n → ∞. And as it turns out, as n → ∞, the binomial random variable Z approaches something called the Poisson random variable with parameter 2.4. That is,

$$\lim_{n \to \infty} Z \sim \text{Po}(2.4).$$
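The convergence can be seen numerically. The Python sketch below is an illustration only: it borrows the Poisson probability formula that is stated in Definition 124 of the next section, and the choice k = 2 and the values of n are arbitrary. It compares the probability of exactly 2 murders in a year under B(n, 2.4/n) for increasingly fine partitions of the year with the Poisson value.

```python
from math import comb, exp, factorial

RATE = 2.4  # long-term average number of murders per year (Example 558)


def binomial_pmf(k, n, p):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    # Formula from Definition 124.
    return lam ** k * exp(-lam) / factorial(k)


k = 2  # probability of exactly 2 murders in a year
for n in (12, 365, 10_000):
    print(n, round(binomial_pmf(k, n, RATE / n), 5))
print("Poisson", round(poisson_pmf(k, RATE), 5))
```

As n grows, the binomial probabilities should move closer and closer to the Poisson one.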


66.1 Formal Definition of the Poisson Random Variable

Definition 124. Y is called a Poisson random variable with parameter λ if Y is a random variable that satisfies the following two properties:
• $\text{Range}(Y) = \{0, 1, 2, 3, \dots\} = \mathbb{Z}_0^+$.
• Y has probability distribution given by

$$P(Y = k) = \frac{\lambda^k e^{-\lambda}}{k!}, \text{ for all } k \in \mathbb{Z}_0^+.$$

The following result establishes that the limit of a binomial random variable is a Poisson random variable.⁷¹

Theorem 13. Let λ > 0. Let $X_n \sim B\left(n, \frac{\lambda}{n}\right)$. Let $Y = \lim_{n \to \infty} X_n$. Then Y is a Poisson random variable with parameter λ.

Proof. This proof is actually also not too difficult. It just involves some algebra and manipulation of limits. But as usual, I’ll put it in the Appendices (on p. 921).

The following three statements are entirely equivalent:
1. Y is a Poisson random variable with parameter λ.
2. The random variable Y has the Poisson distribution with parameter λ.
3. Y ∼ Po(λ).

71 By the way, the Poisson random variable is a discrete random variable because although its range is not finite, its range is countably-infinite.


66.2 When is the Poisson Random Variable an Appropriate Model?

The Poisson random variable is typically used to model the number of “occurrences” or “arrivals” of some phenomenon, within a given timespan or space. We already saw one example where it could be deployed (murders in Singapore). In general, the Poisson random variable is an appropriate model if:
1. (Time-homogeneity.) The rate of occurrences is constant.
2. (Independence.) The probability of occurrence is independent of when the last occurrence took place.

Example 559. Consider the number of goals scored in a given football match. Arguably, an appropriate model for this number is a Poisson random variable, because arguably,
1. The rate of goals is constant.
2. The probability of a goal being scored in, say, the next 60 seconds is independent of when the last goal was scored.

Suppose that, on average, the number of goals scored in a football match is 2.3. We can model the number of goals scored with the Poisson random variable X ∼ Po(λ = 2.3). By definition of the Poisson random variable, the probability that 0 goals are scored is

$$P(X = 0) = \frac{\lambda^k e^{-\lambda}}{k!} = \frac{2.3^0 e^{-2.3}}{0!} = e^{-2.3} \approx 0.100.$$

The probability that 3 goals are scored is

$$P(X = 3) = \frac{\lambda^k e^{-\lambda}}{k!} = \frac{2.3^3 e^{-2.3}}{3!} \approx 0.203.$$

The probability that fewer than 3 goals are scored is

$$P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2) = \frac{2.3^0 e^{-2.3}}{0!} + \frac{2.3^1 e^{-2.3}}{1!} + \frac{2.3^2 e^{-2.3}}{2!} \approx 0.596.$$
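The three probabilities in Example 559 can be recomputed directly from Definition 124. This is a minimal Python sketch for illustration (`poisson_pmf` is our own helper name).

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam), as in Definition 124."""
    return lam ** k * exp(-lam) / factorial(k)


lam = 2.3  # average number of goals per match (Example 559)
p_zero = poisson_pmf(0, lam)
p_three = poisson_pmf(3, lam)
p_fewer_than_three = sum(poisson_pmf(k, lam) for k in range(3))

print(round(p_zero, 3), round(p_three, 3), round(p_fewer_than_three, 3))
```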


Example 560. Consider the number of public mass shootings in the US in a given year. Arguably, an appropriate model for this number is a Poisson random variable, because arguably,
1. The rate of public mass shootings in the US is constant.
2. The probability of a public mass shooting being committed in, say, the next three days is independent of when the last public mass shooting was committed.

Example 561. Consider the number of observed supernovae (explosions of a star) in a given millennium. Arguably, an appropriate model for this number is a Poisson random variable, because arguably,
1. The rate of supernovae is constant.
2. The probability of a supernova being observed in, say, the next ten years is independent of when the last supernova was observed.

Now, no model is perfect. We do not know and may never know the exact processes governing when a goal will be scored, a public mass shooting committed, or a supernova observed. Nonetheless, we can argue that the Poisson random variable works reasonably well as a model. We can make use of this model to analyse the phenomenon at hand.

If we choose not to use the Poisson random variable, then our alternatives are to:
• Find some alternative model that works better than the Poisson random variable.
• Shrug our shoulders and say that the phenomenon cannot be analysed mathematically.

The first alternative, if it exists, is great. The second is anti-intellectual and not very useful.


Exercise 240. For each of the following phenomena, make an argument for whether or not the Poisson random variable is a suitable model. (Answer on p. 1100.)
(a) The number of cats killed in northern Singapore, between 2011 and 2020.
(b) The number of errors in this textbook.
(c) The number of emails you receive, in a given 24-hour timespan.

Exercise 241. This exercise revisits Example 560. Suppose the number of public mass shootings in the US in a given year can be modelled by X, a Poisson random variable with parameter λ = 4.2. Compute the probability that there are more than 5 public mass shootings in the US in a given year. (Answer on p. 1100.)

Exercise 242. This exercise revisits Example 561. Suppose the number of supernovae observed in a millennium can be modelled by Y, a Poisson random variable with parameter λ = 3.7. Compute the probability that there are no supernovae observed in a given millennium. (Answer on p. 1100.)


66.3 The Mean and Variance of the Poisson Random Variable

It turns out that interestingly (and conveniently) enough, the mean and variance of X ∼ Po(λ) are both equal to λ.

Fact 77. Let X ∼ Po(λ). Then E[X] = λ and V[X] = λ.

Proof. The proof is actually not too difficult, given what we know about Maclaurin series. But as usual, I’ll put it in the Appendices (p. 920).


66.4 The Poisson Distribution as an Approximation of the Binomial Distribution

We’ve seen that if $A_n \sim B\left(n, \frac{\lambda}{n}\right)$, then $\lim_{n \to \infty} A_n \sim \text{Po}(\lambda)$.

This implies that if X ∼ B(n, p), n is “large enough”, and p is “small enough”, then the random variable Y ∼ Po(λ = np) serves as a “good” approximation for the random variable X.

Different writers give different (and somewhat-arbitrary) rules-of-thumb as to how “large” n and how “small” p must be in order for the Poisson random variable to serve as a “good” approximation.

The rule-of-thumb we’ll use in this textbook is this: If n ≥ 30 and p ≤ 0.05, then the Poisson distribution is a “good” approximation to the binomial distribution.

The following example illustrates why we might want to approximate the binomial distribution with the Poisson distribution.


Example 562. Problem: We have 300 machines, each of which has, independently, probability 0.02 of breaking down in any given month. What is the probability that at most 10 break down in a given month?

Let X ∼ B(300, 0.02) be the number of machines that break down in a given month. Then

$$P(X \le 10) = p_X(0) + p_X(1) + p_X(2) + \dots + p_X(10) = \binom{300}{0} 0.02^0 \, 0.98^{300} + \binom{300}{1} 0.02^1 \, 0.98^{299} + \dots + \binom{300}{10} 0.02^{10} \, 0.98^{290}.$$

In the old days, it would have been tedious to compute the above probability. So one might instead have preferred to use the Poisson approximation. Now, since n ≥ 30 and p ≤ 0.05, by our rule-of-thumb, Y ∼ Po(np) = Po(6) serves as a suitable approximation to X ∼ B(n = 300, p = 0.02). Thus, P(X ≤ 10) ≈ P(Y ≤ 10).

Now, it would have been easy to find P(Y ≤ 10), because one would have had a print copy of a Poisson table, partly reproduced below. A Poisson table tells us what the value of P(Y ≤ k) is, for various possible values of λ and the number k, given that Y ∼ Po(λ). (For the full table, see sheet titled “Poisson Table” at the usual link.) Reading off the table, we have P(Y ≤ 10) ≈ 0.9574. We thus conclude: The probability that at most 10 machines break down in a given month is approximately 0.9574.

(... Example continued on the next page ...)


(... Example continued from the previous page ...) You might wonder, “Well, weren’t there similarly also binomial tables that one could read off of? If so, why then would we need to go through the trouble of approximating the binomial with the Poisson and then refer to the Poisson table? We could just directly refer to the binomial tables.” Now, observe that to print a Poisson table, we need only be concerned with the Poisson parameter λ and the number k. That’s a total of 2 parameters. So we can, in a single table, list a lot of information. In contrast, to print a binomial table, we have two binomial parameters n and p, and in addition the number k. That’s a total of 3 parameters. So a binomial table really involved multiple binomial tables, one for each value of n! (See this example.) Typically, the tables would end at some small-ish value of n (20 in the linked table). Whereas in this particular example, we would have needed the binomial tables all the way to n = 300!

And so, even though there were binomial tables, these were limited and would typically not have furnished the desired information. This, then, was one big reason for using the Poisson approximation, at least in the old days.

But today, it is no more difficult to compute P(X ≤ 10) than it is to compute P(Y ≤ 10). For example, using my spreadsheet titled “Binomial” (at the usual link), one can simply punch in n = 300 and p = 0.02 and read off that the exact solution to our problem P(X ≤ 10) is approximately 0.9590.

Exercise 243. Suppose the number of deaths by lightning strikes in Singapore in a given year can be modelled by the random variable $X \sim B(5500000, 10^{-6})$. (Answer on p. 1101.)
(a) What is an appropriate interpretation of the numbers 5500000 and $10^{-6}$?
(b) Using a suitable approximation (and justify your use of this approximation), find the probability that at least 5 people are killed by lightning strikes in Singapore in a given year.
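The two numbers quoted in Example 562 can also be obtained without tables or spreadsheets. The Python sketch below is an illustration only and is not the author's spreadsheet. It sums the binomial probabilities for X ∼ B(300, 0.02) and the Poisson probabilities for Y ∼ Po(6) up to k = 10.

```python
from math import comb, exp, factorial

n, p = 300, 0.02
lam = n * p  # Poisson parameter used for the approximation

# Exact binomial probability P(X <= 10).
exact = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(11))

# Poisson approximation P(Y <= 10).
approx = sum(lam ** k * exp(-lam) / factorial(k) for k in range(11))

print(f"binomial P(X <= 10) = {exact:.4f}")
print(f"Poisson  P(Y <= 10) = {approx:.4f}")
```

The gap between the two printed values gives a feel for how good the rule-of-thumb approximation is in this particular case.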


66.5 The Sum of Two Independent Poisson R.V.’s is a Poisson R.V.

Formally:

Theorem 14. Suppose X ∼ Po(λ) and Y ∼ Po(µ) are independent Poisson random variables. Then X + Y ∼ Po(λ + µ).

Proof. (Optional.) We’ll prove that the probability distribution of X + Y is that of the Poisson random variable with parameter λ + µ.

$$P(X + Y = k) = \sum_{i=0}^{k} P(X + Y = k, X = i) = \sum_{i=0}^{k} P(Y = k - i, X = i) \overset{1}{=} \sum_{i=0}^{k} P(Y = k - i) \, P(X = i)$$

$$= \sum_{i=0}^{k} p_Y(k - i) \, p_X(i) = \sum_{i=0}^{k} \frac{\mu^{k-i} e^{-\mu}}{(k - i)!} \cdot \frac{\lambda^{i} e^{-\lambda}}{i!} = e^{-(\lambda + \mu)} \sum_{i=0}^{k} \frac{1}{(k - i)! \, i!} \mu^{k-i} \lambda^{i}$$

$$= \frac{e^{-(\lambda + \mu)}}{k!} \sum_{i=0}^{k} \frac{k!}{(k - i)! \, i!} \mu^{k-i} \lambda^{i} = \frac{e^{-(\lambda + \mu)}}{k!} \sum_{i=0}^{k} \binom{k}{i} \mu^{k-i} \lambda^{i} \overset{2}{=} \frac{e^{-(\lambda + \mu)}}{k!} (\lambda + \mu)^k,$$

where $\overset{1}{=}$ uses the independence of X and Y and $\overset{2}{=}$ uses Fact 67.

Example 563. Problem: There are 34 machines in Room A and 42 in Room B. In any given month, each machine in Room A has, independently, probability 0.03 of breaking down; and each machine in Room B has, independently, probability 0.02 of breaking down. Using a suitable approximation, find the probability that in a given month, more than 2 machines (in total across the two rooms) break down.

Let A ∼ B(34, 0.03) and B ∼ B(42, 0.02) be the numbers of machines that break down in Rooms A and B, respectively. We could directly solve this problem by finding P(A + B > 2). It is however quicker if we use the Poisson distribution as an approximation. (We have a simple formula for the sum of two independent Poisson random variables. In contrast, there is no similarly simple formula for the sum of two independent binomial random variables.)

A suitable approximation for A is X ∼ Po(34 × 0.03) = Po(1.02). A suitable approximation for B is Y ∼ Po(42 × 0.02) = Po(0.84). Thus, a suitable approximation for A + B is X + Y ∼ Po(1.02 + 0.84) = Po(1.86). Hence, P(A + B > 2) ≈ P(X + Y > 2) = 1 − P (X + Y ≤ 2) ≈ 0.285.
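As a rough check on Example 563, the sketch below computes both the Poisson approximation and the exact binomial answer. The function names are hypothetical helpers of mine; the exact answer is found by listing the few pairs (a, b) with a + b ≤ 2.

```python
import math

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(A = k) for A ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Po(lam)."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))

# Poisson approximation: X + Y ~ Po(1.02 + 0.84) = Po(1.86).
approx = 1 - poisson_cdf(2, 34 * 0.03 + 42 * 0.02)

# Exact answer: sum P(A = a) P(B = b) over all pairs with a + b <= 2.
exact_at_most_2 = sum(
    binom_pmf(a, 34, 0.03) * binom_pmf(b, 42, 0.02)
    for a in range(3)
    for b in range(3 - a)
)
exact = 1 - exact_at_most_2

print(round(approx, 4), round(exact, 4))
```

The two numbers should agree closely, which is what makes the Poisson approximation attractive here.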

Exercise 244. (Answer on p. 1101.) Singapore’s population is 5,500,000 and Malaysia’s is 30,000,000. In any given year, each Singaporean has, independently, probability 10⁻⁶ of being killed by a lightning strike; and each Malaysian has, independently, probability 10⁻⁷ of suffering the same fate.

Using a suitable approximation, find the probability that in any given year, at least 10 people are killed by lightning strikes in Singapore and Malaysia, combined.


The sum of two independent Poisson random variables is itself a Poisson random variable. We might thus wonder, “Is the difference of two independent Poisson random variables also a Poisson random variable?” Unfortunately, the answer is no. A trivial reason for this is that the difference of two independent Poisson random variables can take on negative values. In contrast, a Poisson random variable never takes on negative values. To illustrate:

Example 563 (continued from above). Reproduced from above: There are 34 machines in Room A and 42 in Room B. In any given month, each machine in Room A has, independently, probability 0.03 of breaking down; and each machine in Room B has, independently, probability 0.02 of breaking down. Let A ∼ B(34, 0.03) and B ∼ B(42, 0.02). As before, we approximate A by X ∼ Po(1.02) and B by Y ∼ Po(0.84), with X and Y independent. We now show that Y − X is not a Poisson random variable.

The ranges of X and Y are both Z⁺₀ = {0, 1, 2, . . . }. Thus, the range of Y − X is Z = {. . . , −2, −1, 0, 1, 2, . . . }, which includes negative numbers. By the definition of a Poisson random variable then (Definition 124), Y − X cannot possibly be a Poisson random variable.


67

The Continuous Uniform Distribution

So far, all examples of random variables we’ve seen have been discrete. For example, the binomial random variable X ∼ B (n, p) is discrete, because Range (X) = {0, 1, 2, . . . , n} is finite.

We’ll now look at continuous random variables. Informally, a random variable Y is continuous if its range takes on a continuum of values. For H2 Maths, you need only learn about one continuous random variable: the normal random variable (subject of the next chapter). Nonetheless, we’ll first look at another continuous random variable that is not in the syllabus. This is the continuous uniform random variable. It is much simpler than the normal random variable and can thus help build up your intuition of how continuous random variables work.


67.1

The Continuous Uniform Distribution

A line measuring exactly 1 metre in length is drawn on the floor. It is about to rain. Let X be the position of the first rain-drop that hits the line. X is measured as the distance (in metres) from the left-most point of the line. So for example, if the first rain-drop hits the left-most point of the line, then x = 0. If it hits the exact midpoint of the line, then x = 0.5. And if it hits the right-most point, then x = 1. Assume we can measure X to infinite precision.

Then, assuming the first rain-drop is equally likely to hit any point of the line, we can model X as a continuous uniform random variable on [0, 1]. This says that

• The range of X is [0, 1] (the first rain-drop can hit any point along the line); and
• X is equally likely to take on any value in the interval [0, 1] (the first rain-drop is equally likely to hit any point along the line).

The following three statements are entirely equivalent:

1. X is a continuous uniform random variable on [0, 1].
2. X is a random variable with the continuous uniform distribution on [0, 1].
3. X ∼ U[0, 1].

Recall that previously with any discrete random variable Y, we could find its probability distribution. That is, we could find P(Y = k) (the probability that Y takes on the value k). For example, if Y ∼ B(3, 0.5) modelled the number of heads in three coin-flips, then the probability that there was one heads was $P(Y = 1) = \binom{3}{1}\, 0.5^{1}\, 0.5^{2} = \frac{3}{8}$.

Now, in contrast, for any continuous random variable X, strangely enough, there is zero probability that X takes on any particular value! For example, if X ∼ U [0, 1], then P (X = 0.37) = 0. That is, there is zero probability that X takes on the value of 0.37! At first glance, this may seem strange.

But remember: There are infinitely-many real numbers in the interval [0, 1]. So it makes sense to say that the probability of X taking on any particular value is zero.⁷²

⁷² But strangely enough, zero probability is not the same thing as impossible. For example, we’d say that

• There is zero probability, but it is not impossible that X ∼ U[0, 1] takes on the value 0.37.
• There is zero probability and it is impossible that X ∼ U[0, 1] takes on the value 1.2.

(Mathematicians say that a zero-probability event such as X = 0.37 happens “almost never”, a term that has a precise definition.)


So for any continuous random variable X, it is pointless to try to write down P (X = k) for different possible values of k, because P (X = k) is always equal to zero (regardless of what k is). Instead, we shall try to write down P (a ≤ X ≤ b), for different possible values of a and b.

Now, if X ∼ U [0, 1], then the probability that X takes on values between 0.3 and 0.7 is simply 0.7 − 0.3 = 0.4. That is, P (0.3 ≤ X ≤ 0.7) = 0.7 − 0.3 = 0.4.

Similarly, the probability that X takes on values between 0.16 and 0.35 is simply 0.35−0.16 = 0.19. That is, P (0.16 ≤ X ≤ 0.35) = 0.35 − 0.16 = 0.19.

The above observations suggest that it may be useful to define a new concept, called the cumulative distribution function.


67.2

The Cumulative Distribution Function (CDF)

The CDF simply tells us the probability that X takes on values less than or equal to k, for every k ∈ R. Formally:

Definition 125. The cumulative distribution function (CDF) of a random variable X is the function FX ∶ R → R given by the mapping rule FX (k) = P (X ≤ k) .

It turns out that every random variable can be uniquely defined by giving its CDF. For example, the continuous uniform random variable is formally defined thus:

Definition 126. X is the continuous uniform random variable on [0, 1] if its CDF F_X ∶ R → R is defined by

$$
F_X(k) = \begin{cases} 0, & \text{if } k < 0, \\ k, & \text{if } k \in [0, 1], \\ 1, & \text{if } k > 1. \end{cases}
$$
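Definition 126 translates almost line for line into code. The sketch below is illustrative only (the function names are mine). It uses the fact, noted earlier, that P(X = a) = 0 for this X, so that P(a ≤ X ≤ b) = F_X(b) − F_X(a) whenever a ≤ b.

```python
def uniform_cdf(k: float) -> float:
    """CDF of X ~ U[0, 1], following Definition 126."""
    if k < 0:
        return 0.0
    if k > 1:
        return 1.0
    return float(k)

def prob_between(a: float, b: float) -> float:
    """P(a <= X <= b) = F(b) - F(a), valid for a <= b."""
    return uniform_cdf(b) - uniform_cdf(a)

print(prob_between(0.3, 0.7))    # the 0.7 - 0.3 example from the text
print(prob_between(0.16, 0.35))  # the 0.35 - 0.16 example from the text
print(prob_between(-1, 0.5))     # the part of the interval below 0 contributes nothing
```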

Armed with the concept of the CDF, the formal definition of a continuous random variable can be simply stated:

Definition 127. A random variable X is continuous if its CDF F_X is continuous.

We can now summarise the three possible types of random variables.

1. Discrete random variables. A random variable is discrete if its range is finite.⁷³ Examples: Bernoulli, binomial.
2. Continuous random variables. A random variable is continuous if its CDF is continuous. Examples: Continuous uniform, normal.
3. Other random variables. There are random variables that are neither discrete nor continuous. But you will not study any of these for the A-levels.

Note that every random variable (discrete, continuous, or otherwise) has a cumulative distribution function (CDF).

⁷³ Or countably-infinite.


67.3

Important Digression: P (X ≤ k) = P (X < k)

For any continuous random variable X, we have

P (X ≤ k) = P (X < k) .

That is, whether an inequality is strict makes no difference. The reason is that by the third Kolmogorov axiom (additivity), P (X ≤ k) = P (X < k) + P (X = k) = P (X < k) + 0 = P (X < k) .

Thus, for continuous random variables, it doesn’t matter whether inequalities are strict or weak. Example 564. Let X ∼ U [0, 1]. Then

P (0.2 ≤ X ≤ 0.5) = P (0.2 < X ≤ 0.5) = P (0.2 ≤ X < 0.5) = P (0.2 < X < 0.5) .


67.4

The Probability Density Function (PDF)

The PDF is simply defined as the derivative of the CDF.⁷⁴

Definition 128. Let X be a random variable whose CDF F_X is differentiable. Then the probability density function (PDF) of X is the function f_X ∶ R → R defined by

$$
f_X(k) = \frac{d}{dk} F_X(k).
$$

The PDF has an intuitive interpretation. The area under the PDF between points a and b is equal to P(a ≤ X ≤ b). This, of course, is simply a consequence of the Fundamental Theorems of Calculus (FTC):

$$
\int_a^b f_X(k)\,dk = \int_a^b \frac{d}{dk} F_X(k)\,dk \overset{\text{FTC}}{=} F_X(b) - F_X(a) = P(X \le b) - P(X \le a) = P(a \le X \le b).
$$

The PDF of X ∼ U[0, 1] (graphed below) is simply the function f_X ∶ R → R defined by

$$
f_X(k) = \begin{cases} 1, & \text{if } k \in [0, 1], \\ 0, & \text{otherwise.} \end{cases}
$$
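The "area under the PDF" interpretation can be illustrated numerically. This is a minimal sketch under my own naming (`uniform_pdf`, `area`), using a simple midpoint rule rather than any particular library routine.

```python
def uniform_pdf(k: float) -> float:
    """PDF of X ~ U[0, 1]: 1 on [0, 1] and 0 elsewhere."""
    return 1.0 if 0 <= k <= 1 else 0.0

def area(pdf, a: float, b: float, steps: int = 10_000) -> float:
    """Midpoint-rule approximation of the integral of pdf over [a, b]."""
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width

print(area(uniform_pdf, 0.5, 0.75))  # approximates P(0.5 <= X <= 0.75)
print(area(uniform_pdf, 0.2, 0.3))   # approximates P(0.2 <= X <= 0.3)
print(area(uniform_pdf, -1, 2))      # total area under the PDF, close to 1
```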

For any a ≤ b, the area under the PDF between a and b is precisely P (a ≤ X ≤ b). For example, there is probability 0.25 (red area) that X takes on values between 0.5 and 0.75. There is probability 0.1 (blue area) that X takes on values between 0.2 and 0.3. Exercise 245. The continuous uniform random variable Y ∼ U[3, 5] is equally likely to take on values between 3 and 5, inclusive. (a) Write down its CDF FY . (b) Write down and graph its PDF fY . (c) Compute, and also illustrate on your graph, the quantities P (3.1 ≤ Y ≤ 4.6) and P (4.8 ≤ Y ≤ 4.9). (Answer on p. 1102.)

⁷⁴ Note that although every random variable has a CDF, not every random variable has a PDF. In particular, if the random variable’s CDF is not differentiable, then by our definition here, the random variable does not have a PDF.


68

The Normal Distribution

The standard normal (or Gaussian) random variable (SNRV) is very important. In fact, it is so important that we usually reserve the letter Z for it, and the Greek letters φ and Φ (lower- and upper-case phi) for its PDF and CDF. The following three statements are entirely equivalent:

1. Z is a SNRV.
2. Z is a random variable with the standard normal distribution.
3. Z ∼ N(0, 1).

Here’s the formal definition:

Definition 129. Z is called a standard normal random variable (SNRV) if its PDF φ ∶ R → R is defined by:

$$
\varphi(a) = \frac{1}{\sqrt{2\pi}}\, e^{-0.5a^{2}}.
$$

For the A-levels, you need not remember this complicated-looking PDF. Nor need you understand where it comes from. The normal PDF is often also referred to as the bell curve, due to its resemblance to a bell (kinda).

As with the continuous uniform, for any a ≤ b, the area under the normal PDF between a and b gives us precisely P(a ≤ Z ≤ b). For example, the probability that Z takes on values between 0.5 and 0.75 is Φ(0.75) − Φ(0.5) ≈ 0.7734 − 0.6915 = 0.0819. And the probability that Z takes on values between 0.2 and 0.3 is Φ(0.3) − Φ(0.2) ≈ 0.6179 − 0.5793 = 0.0386.

As usual, the CDF Φ ∶ R → R is defined by:

$$
\Phi(a) = P(Z \le a) = \int_{-\infty}^{a} \varphi(x)\,dx = \int_{-\infty}^{a} \frac{1}{\sqrt{2\pi}}\, e^{-0.5x^{2}}\,dx.
$$

Unfortunately, this last integral has no simpler expression (mathematicians would say that it has no “closed-form expression”). Instead, as we’ll soon see, we have to use the so-called Z-tables (or a graphing calculator) to look up values of Φ(k). The next fact summarises the properties of the normal distribution. Some of these properties are illustrated in the figure that follows. Fact 78. Let Z ∼ N(0, 1) and φ and Φ be its PDF and CDF.

1. Φ(∞) = 1. (As with any random variable, the area under the entire PDF is 1.)
2. φ(a) > 0, for all a ∈ R. (The PDF is positive everywhere. This has a surprising implication: however large a is, there is always some non-zero probability that Z ≥ a.)
3. E[Z] = 0. (The mean of Z is 0.)
4. The PDF φ reaches a global maximum at the mean 0. (In fact, we can go ahead and compute φ(0) = 1/√(2π) ≈ 0.399.)
5. V[Z] = 1. (The variance of Z is 1.)
6. P(Z ≤ a) = P(Z < a). (We’ve already discussed this earlier. It makes no difference whether the inequality is strict. This is because P(Z = a) = 0.)
7. The PDF φ is symmetric about the mean. This has several implications:
   (a) P(Z ≥ a) = P(Z ≤ −a) = Φ(−a).
   (b) Since P(Z ≥ a) = 1 − P(Z ≤ a) = 1 − Φ(a), it follows that Φ(−a) = 1 − Φ(a) or, equivalently, Φ(a) = 1 − Φ(−a).
   (c) Φ(0) = 1 − Φ(0) = 0.5.
8. P(−1 ≤ Z ≤ 1) = Φ(1) − Φ(−1) ≈ 0.6827. (There is probability 0.6827 that Z takes on values within 1 standard deviation of the mean.)
9. P(−2 ≤ Z ≤ 2) = Φ(2) − Φ(−2) ≈ 0.9545. (There is probability 0.9545 that Z takes on values within 2 standard deviations of the mean.)
10. P(−3 ≤ Z ≤ 3) = Φ(3) − Φ(−3) ≈ 0.9973. (There is probability 0.9973 that Z takes on values within 3 standard deviations of the mean.)
11. The PDF φ has two points of inflexion, namely at ±1. (The points of inflexion are one standard deviation away from the mean.)

Proof. Optional, see p. 922 in the Appendices.
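Several of the numbers in Fact 78 are easy to reproduce on a computer. The sketch below is not how the fact is proved; it relies on the standard identity Φ(a) = ½[1 + erf(a/√2)], where erf is the error function available in Python's `math` module. The names `phi` and `Phi` are my own.

```python
import math

def phi(a: float) -> float:
    """Standard normal PDF."""
    return math.exp(-0.5 * a * a) / math.sqrt(2 * math.pi)

def Phi(a: float) -> float:
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1 + math.erf(a / math.sqrt(2)))

print(round(phi(0), 3))                          # property 4
for n in (1, 2, 3):                              # properties 8 to 10
    print(n, round(Phi(n) - Phi(-n), 4))
print(abs(Phi(-1.3) - (1 - Phi(1.3))) < 1e-12)   # property 7(b) at a = 1.3
```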

[Figure: graph of the standard normal PDF φ, with the horizontal axis running from −4 to 4, illustrating some of the properties in Fact 78.]

Example 565. Let’s use the TI84 to find Φ(2.51).

1. Press the blue 2ND button and then DISTR (which corresponds to the VARS button). This brings up the DISTR menu.

2. Press 2 to select the “normalcdf” option.

The TI84 is now asking for your lower and upper bounds. Since Φ(2.51) = Φ(2.51) − Φ(−∞), your lower bound is −∞ and your upper bound is 2.51.

3. But there’s no way to enter −∞ on your TI84. So instead, you’ll enter −10⁹⁹, which is simply a very large negative number. To do so, press (-) , the blue 2ND button, EE (which corresponds to the , button), and then 9 9 . (Don’t press ENTER yet!)

4. Now to enter your upper bound. First press , (this simply demarcates your lower and upper bounds). Then enter your upper bound 2.51 by pressing 2 . 5 1 . Then press ENTER . Your TI84 says that the answer is Φ(2.51) ≈ 0.99396.

[Screenshots of the TI84: After Step 1. After Step 2. After Step 3. After Step 4.]

Example 566. To find Φ(−2.51), Φ(1.372), and P(−4 ≤ Z ≤ 4), the steps are very similar. So for each, I’ll simply give the screenshot from the TI84:

[Screenshots of the TI84: Φ(−2.51). Φ(1.372). P(−4 ≤ Z ≤ 4).]
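For readers without a TI84 to hand, the same four quantities can be computed in Python. This is a sketch under assumptions: the function `normalcdf` below is my own imitation of the calculator's lower-bound/upper-bound interface, built on `statistics.NormalDist` from the standard library (Python 3.8 or later).

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)  # the SNRV
VERY_NEGATIVE = -1e99          # stands in for minus infinity, as on the TI84

def normalcdf(lower: float, upper: float, dist: NormalDist = Z) -> float:
    """P(lower <= X <= upper) for the given normal distribution."""
    return dist.cdf(upper) - dist.cdf(lower)

print(round(normalcdf(VERY_NEGATIVE, 2.51), 5))   # Phi(2.51)
print(round(normalcdf(VERY_NEGATIVE, -2.51), 5))  # Phi(-2.51)
print(round(normalcdf(VERY_NEGATIVE, 1.372), 5))  # Phi(1.372)
print(round(normalcdf(-4, 4), 5))                 # P(-4 <= Z <= 4)
```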

Example 567. We’ll find Φ(2.51), Φ(−2.51), Φ(1.372), and P (−4 ≤ Z ≤ 4) using Z-tables.

Refer to the Z-tables on p. 633. (These are the exact same tables that appear on the List of Formulae you’ll get during exams.)

• To find Φ(2.51), look at the row labelled 2.5 and the column labelled 1, and read off the number 0.9940. We thus have Φ(2.51) = 0.9940.

• To find Φ(−2.51), note that the table does not explicitly give values of Φ(z), if z < 0. But we can exploit the fact that the standard normal is symmetric about the mean µ = 0. This fact implies that Φ(−z) = 1 − Φ(z). Hence, Φ(−2.51) = 1 − Φ(2.51) = 0.0060.

• To find Φ(1.372), first look at the row labelled 1.3 and the column labelled 7, and read off the number 0.9147. This tells us that Φ(1.37) = 0.9147. Now look at the right end of the table (where it says “ADD”). Since the third decimal place of 1.372 is 2, we look under the column labelled 2. This tells us to ADD 3 to the fourth decimal place. Thus, Φ(1.372) = 0.9147 + 0.0003 = 0.9150.

• To find P(−4 ≤ Z ≤ 4), the Z-tables printed are actually useless, because they only go to 2.99. So you can just write P(−4 ≤ Z ≤ 4) ≈ 1.

Exercise 246. Using both the Z-tables and your graphing calculator, find the following: (a) P(Z ≥ 1.8). (b) P(−0.351 < Z < 1.2). (Answer on p. 1103.)


THE NORMAL DISTRIBUTION FUNCTION

If Z has a normal distribution with mean 0 and variance 1 then, for each value of z, the table gives the value of Φ(z), where Φ(z) = P(Z ≤ z).

For negative values of z use Φ(−z) = 1 − Φ(z).

                                                                             |            ADD
  z  |   0      1      2      3      4      5      6      7      8      9    |  1  2  3  4  5  6  7  8  9
 0.0 | 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359 |  4  8 12 16 20 24 28 32 36
 0.1 | 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753 |  4  8 12 16 20 24 28 32 36
 0.2 | 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141 |  4  8 12 15 19 23 27 31 35
 0.3 | 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517 |  4  7 11 15 19 22 26 30 34
 0.4 | 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879 |  4  7 11 14 18 22 25 29 32
 0.5 | 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224 |  3  7 10 14 17 20 24 27 31
 0.6 | 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549 |  3  7 10 13 16 19 23 26 29
 0.7 | 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852 |  3  6  9 12 15 18 21 24 27
 0.8 | 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133 |  3  5  8 11 14 16 19 22 25
 0.9 | 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389 |  3  5  8 10 13 15 18 20 23
 1.0 | 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621 |  2  5  7  9 12 14 16 19 21
 1.1 | 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830 |  2  4  6  8 10 12 14 16 18
 1.2 | 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015 |  2  4  6  7  9 11 13 15 17
 1.3 | 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177 |  2  3  5  6  8 10 11 13 14
 1.4 | 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319 |  1  3  4  6  7  8 10 11 13
 1.5 | 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441 |  1  2  4  5  6  7  8 10 11
 1.6 | 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545 |  1  2  3  4  5  6  7  8  9
 1.7 | 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633 |  1  2  3  4  4  5  6  7  8
 1.8 | 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706 |  1  1  2  3  4  4  5  6  6
 1.9 | 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767 |  1  1  2  2  3  4  4  5  5
 2.0 | 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817 |  0  1  1  2  2  3  3  4  4
 2.1 | 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857 |  0  1  1  2  2  2  3  3  4
 2.2 | 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890 |  0  1  1  1  2  2  2  3  3
 2.3 | 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916 |  0  1  1  1  1  2  2  2  2
 2.4 | 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936 |  0  0  1  1  1  1  1  2  2
 2.5 | 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952 |  0  0  0  1  1  1  1  1  1
 2.6 | 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964 |  0  0  0  0  1  1  1  1  1
 2.7 | 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974 |  0  0  0  0  0  1  1  1  1
 2.8 | 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981 |  0  0  0  0  0  0  0  1  1
 2.9 | 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986 |  0  0  0  0  0  0  0  0  0

Critical values for the normal distribution

If Z has a normal distribution with mean 0 and variance 1 then, for each value of p, the table gives the value of z such that P(Z ≤ z) = p.

 p | 0.75   0.90   0.95   0.975  0.99   0.995  0.9975 0.999  0.9995
 z | 0.674  1.282  1.645  1.960  2.326  2.576  2.807  3.090  3.291
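The critical values above are values of the inverse of Φ. As a hedged illustration (this is a check of the table, not how the table was produced), Python's `statistics.NormalDist.inv_cdf` can recover them to three decimal places.

```python
from statistics import NormalDist

Z = NormalDist()  # mean 0, standard deviation 1

levels = (0.75, 0.90, 0.95, 0.975, 0.99, 0.995, 0.9975, 0.999, 0.9995)

# For each p, find z such that P(Z <= z) = p.
critical = {p: Z.inv_cdf(p) for p in levels}

for p, z in critical.items():
    print(p, round(z, 3))
```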

68.1

The Normal Distribution, in General

Let Z ∼ N(0, 1) be the SNRV and σ, µ ∈ R be constants.

Consider σZ + µ, itself a random variable. We know that since E[Z] = 0 and V[Z] = 1, it follows that E[σZ + µ] = σE[Z] + µ = µ and V[σZ + µ] = σ²V[Z] = σ².

It turns out that σZ + µ is a normal random variable with mean µ and variance σ²:

Definition 130. X is called a normal random variable with mean µ and variance σ² if its PDF f_X ∶ R → R is defined by:

$$
f_X(a) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-0.5\left(\frac{a-\mu}{\sigma}\right)^{2}}.
$$

Once again, for the A-levels, you need not remember this complicated-looking PDF. Nor need you understand where it comes from. The following three statements are entirely equivalent:

1. X is a normal random variable with mean µ and variance σ².
2. X is a random variable with normal distribution of mean µ and variance σ².
3. X ∼ N(µ, σ²).


Example 568. The normal random variables A ∼ N(−1, 1), B ∼ N(1, 1), and C ∼ N(2, 1) have variance 1 (just like the SNRV), but non-zero means. Their PDFs are graphed below. (Included for reference is the standard normal PDF in black.) We see that the effect of increasing the mean µ is to move the graph of the PDF rightwards. And decreasing the mean moves it leftwards.


Example 569. The normal random variables D ∼ N(0, 0.1), E ∼ N(0, 2), and F ∼ N(0, 3) have mean 0 (just like the SNRV), but non-unit variances. Their PDFs are graphed below. (Included for reference is the standard normal PDF in black.) The effect of changing the variance σ² is this:

• The larger the variance, the “fatter” the “tails” of the PDF and the shorter the peak.
• Conversely, the smaller the variance, the “thinner” the “tails” of the PDF and the taller the peak.


Example 570. The normal random variables G ∼ N(−1, 0.1), H ∼ N(1, 2), and I ∼ N(2, 3) have non-zero means and non-unit variances. Their PDFs are graphed below. (Included for reference is the standard normal PDF in black.)

Exercise 247. Let X ∼ N(µ, σ²). Verify that if µ = 0 and σ² = 1, then for all a ∈ R, we have f_X(a) = φ(a). What can you conclude? (Answer on p. 1104.)


In general, normality is preserved under linear transformations:

Fact 79. Let X ∼ N(µ, σ²) and a, b ∈ R be constants. Then aX + b ∼ N(aµ + b, a²σ²).

Proof. Optional, see p. 924 in the Appendices.

Thus, we can easily transform any normal random variable into the SNRV:

Corollary 6. If X ∼ N(µ, σ²), then

$$
\frac{X - \mu}{\sigma} = Z \sim N(0, 1). \quad \text{Equivalently, } X = \sigma Z + \mu.
$$

(Just to be clear, two random variables are identical if their CDFs are identical.)

Proof. The next exercise asks you to prove this corollary.

Exercise 248. Using Fact 79, prove that if X ∼ N(µ, σ²), then (Answer on p. 1104.)

$$
\frac{X - \mu}{\sigma} = Z \sim N(0, 1).
$$

The above corollary gives us an alternative method for computing probabilities associated with normal random variables. In general, if X ∼ N(µ, σ²), then

$$
P(X \le c) = P(\sigma Z + \mu \le c) = P\left(Z \le \frac{c - \mu}{\sigma}\right) = \Phi\left(\frac{c - \mu}{\sigma}\right).
$$

The properties that we listed for the SNRV also apply, with only a few modifications, to any NRV. I highlight any differences in red. The figure that follows illustrates.

Fact 80. Let X ∼ N(µ, σ²) and let f_X and F_X be the PDF and CDF of X.

1. F_X(∞) = 1. (The area under the entire PDF is 1. This, of course, is true of any random variable.)
2. f_X(a) > 0, for all a ∈ R. (The PDF is positive everywhere. This has the surprising implication that no matter how large a is, there is always some non-zero probability that X ≥ a.)
3. E[X] = µ. (The mean of X is µ.)
4. The PDF f_X reaches a global maximum at the mean µ. (In fact, we can go ahead and compute f_X(µ) = 1/(σ√(2π)) ≈ 0.399/σ.)
5. V[X] = σ². (The variance of X is σ².)
6. P(X ≤ a) = P(X < a). (We’ve already discussed this earlier. It makes no difference whether the inequality is strict. This is because P(X = a) = 0.)
7. The PDF f_X is symmetric about the mean. This has several implications:
   (a) P(X ≥ µ + a) = P(X ≤ µ − a) = F_X(µ − a).
   (b) Since P(X ≥ µ + a) = 1 − P(X ≤ µ + a) = 1 − F_X(µ + a), it follows that F_X(µ − a) = 1 − F_X(µ + a) or, equivalently, F_X(µ + a) = 1 − F_X(µ − a).
   (c) F_X(µ) = 1 − F_X(µ) = 0.5.
8. P(µ − σ ≤ X ≤ µ + σ) = Φ(1) − Φ(−1) ≈ 0.6827. (There is probability 0.6827 that X takes on values within 1 standard deviation of the mean.)
9. P(µ − 2σ ≤ X ≤ µ + 2σ) = Φ(2) − Φ(−2) ≈ 0.9545. (There is probability 0.9545 that X takes on values within 2 standard deviations of the mean.)
10. P(µ − 3σ ≤ X ≤ µ + 3σ) = Φ(3) − Φ(−3) ≈ 0.9973. (There is probability 0.9973 that X takes on values within 3 standard deviations of the mean.)
11. The PDF f_X has two points of inflexion, namely at µ ± σ. (The points of inflexion are one standard deviation away from the mean.)

Proof. See the next exercise.

Exercise 249. Prove all of the properties listed in Fact 80. (Hint: Use Corollary 6 to convert X into the SNRV. Then simply apply Fact 78.) (Answer on p. 1105.)


Example 571. Let G ∼ N(−1, 0.1), H ∼ N(1, 2), and I ∼ N(2, 3). We’ll find P (G < 2) using our TI84. The first few steps are similar to before:

1. Press the blue 2ND button and then VARS (which corresponds to the DISTR button). This brings up the DISTR menu.

2. Press 2 to select the “normalcdf” option.

3. Enter the lower bound −10⁹⁹ by pressing (-) , the blue 2ND button, EE (which corresponds to the , button), and then 9 9 . (Don’t press ENTER yet!)

4. Enter the upper bound 2 by pressing , and 2 . (Don’t press ENTER yet!!)

[Screenshots of the TI84: After Step 1. After Step 2. After Step 3. After Step 4.]

Previously, we didn’t bother telling the TI84 our mean µ and standard deviation σ. And so by default, if we pressed ENTER at this point, the TI84 simply assumed that we wanted the SNRV Z ∼ N(0, 1). Now we’ll tell the TI84 what µ and σ are:

5. First enter the mean µ = −1. Press , (-) 1 .

6. Now enter the standard deviation σ = √0.1 (and not the variance). Press , √ 0 . 1 ) . Finally, press ENTER . The TI84 says that P(G < 2) ≈ 1.

[Screenshots of the TI84: After Step 5. After Step 6.]

Finding P(H < 2), P(I < 2), P(−1 < G < 1), P(−1 < H < 1), and P(−1 < I < 1) is similar:

[Screenshots of the TI84: P(H < 2) and P(I < 2). P(−1 < G < 1). P(−1 < H < 1). P(−1 < I < 1).]

Since I has mean µ = 2, we should have exactly P(I < 2) = 0.5. So here the TI84 has actually made a small error in reporting instead that P(I < 2) ≈ 0.5000000005.

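Example 572. We now redo the previous example, but use Z-tables:

$$
\begin{aligned}
P(G < 2) &= P\left(Z < \frac{2 - \mu_G}{\sigma_G} = \frac{2 - (-1)}{\sqrt{0.1}} \approx 9.4868\right) = \Phi(9.4868) \approx 1, \\
P(H < 2) &= P\left(Z < \frac{2 - \mu_H}{\sigma_H} = \frac{2 - 1}{\sqrt{2}} \approx 0.7071\right) = \Phi(0.7071) \approx 0.7601, \\
P(I < 2) &= P\left(Z < \frac{2 - \mu_I}{\sigma_I} = \frac{2 - 2}{\sqrt{3}} = 0\right) = \Phi(0) = 0.5, \\
P(-1 < G < 1) &= P\left(0 = \frac{-1 - (-1)}{\sqrt{0.1}} < Z < \frac{1 - (-1)}{\sqrt{0.1}} \approx 6.3246\right) = \Phi(6.3246) - \Phi(0) \approx 1 - 0.5 = 0.5, \\
P(-1 < H < 1) &= P\left(-1.4142 \approx \frac{-1 - 1}{\sqrt{2}} < Z < \frac{1 - 1}{\sqrt{2}} = 0\right) = \Phi(0) - \Phi(-1.4142) \\
&= 0.5 - [1 - \Phi(1.4142)] = \Phi(1.4142) - 0.5 \approx 0.9213 - 0.5 = 0.4213, \\
P(-1 < I < 1) &= P\left(-1.7321 \approx \frac{-1 - 2}{\sqrt{3}} < Z < \frac{1 - 2}{\sqrt{3}} \approx -0.5774\right) = \Phi(-0.5774) - \Phi(-1.7321) \\
&= 1 - \Phi(0.5774) - [1 - \Phi(1.7321)] = \Phi(1.7321) - \Phi(0.5774) \approx 0.9584 - 0.7182 = 0.2402.
\end{aligned}
$$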
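The standardisation step used above (Corollary 6) can be sketched in a few lines of Python. The helper names and the dictionary of parameters are my own; the only library facility assumed is the standard normal CDF from `statistics.NormalDist`, which plays the role of the Z-tables.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # the SNRV, whose cdf plays the role of the Z-tables

def prob_less_than(c: float, mu: float, var: float) -> float:
    """P(X < c) for X ~ N(mu, var), computed as Phi((c - mu) / sigma)."""
    return Z.cdf((c - mu) / math.sqrt(var))

def prob_between(lo: float, hi: float, mu: float, var: float) -> float:
    """P(lo < X < hi) for X ~ N(mu, var)."""
    return prob_less_than(hi, mu, var) - prob_less_than(lo, mu, var)

# (mean, variance) of G, H and I, as given in Example 571.
params = {"G": (-1, 0.1), "H": (1, 2), "I": (2, 3)}

for name, (mu, var) in params.items():
    print(name,
          round(prob_less_than(2, mu, var), 4),
          round(prob_between(-1, 1, mu, var), 4))
```

Small differences in the fourth decimal place from the Z-table answers are to be expected, since the tables are rounded to four decimal places.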

Exercise 250. Let X ∼ N(2.14, 5) and Y ∼ N(−0.33, 2). Using both the Z-tables and your graphing calculator, find the following: (a) P (X ≥ 1) and P (Y ≥ 1). (b) P (−2 ≤ X ≤ −1.5) and P (−2 ≤ Y ≤ −1.5). (Answer on p. 1106.)


68.2

Sum of Independent Normal Random Variables

Theorem 15. If X and Y are independent normal random variables, then X + Y is also a normal random variable. Moreover, X − Y is also a normal random variable.

Proof. Omitted.

We already knew from before that E[X ± Y] = E[X] ± E[Y]. Moreover, if X and Y are independent, then V[X ± Y] = V[X] + V[Y]. Thus, the above theorem implies:

Corollary 7. Let X ∼ N(µ_X, σ_X²) and Y ∼ N(µ_Y, σ_Y²) be independent and a, b ∈ R be constants. Then X + Y ∼ N(µ_X + µ_Y, σ_X² + σ_Y²) and more generally, aX + bY ∼ N(aµ_X + bµ_Y, a²σ_X² + b²σ_Y²).

Moreover, X − Y ∼ N(µ_X − µ_Y, σ_X² + σ_Y²) and more generally, aX − bY ∼ N(aµ_X − bµ_Y, a²σ_X² + b²σ_Y²).

Examples:


Example 573. The weight (in kg) of a sumo wrestler is modelled by X ∼ N(200, 50). Assume that the weight of each sumo wrestler is independent of the weight of any other sumo wrestler. We randomly choose two sumo wrestlers. (a) What is the probability that their total weight is greater than 405 kg? (b) What is the probability that one is more than 10% heavier than the other?

(a) Let X₁ ∼ N(200, 50) and X₂ ∼ N(200, 50) be the weights of the first and second sumo wrestlers. Then X₁ + X₂ ∼ N(400, 100). Thus,

$$
P(X_1 + X_2 > 405) = P\left(Z > \frac{405 - 400}{\sqrt{100}}\right) = P(Z > 0.5) = 1 - \Phi(0.5) \approx 1 - 0.6915 = 0.3085.
$$

(b) Our goal is to find p = P(X₁ > 1.1X₂) + P(X₂ > 1.1X₁). This is the probability that the first sumo wrestler is more than 10% heavier than the second, plus the probability that the second is more than 10% heavier than the first. Of course, by symmetry, these two probabilities are equal. Thus, p = 2 × P(X₁ > 1.1X₂). Now, P(X₁ > 1.1X₂) = P(X₁ − 1.1X₂ > 0).

But X₁ − 1.1X₂ ∼ N(200 − 1.1 ⋅ 200, 50 + 1.1² ⋅ 50) = N(−20, 110.5). Thus,

$$
P(X_1 > 1.1X_2) = P(X_1 - 1.1X_2 > 0) = P\left(Z > \frac{0 - (-20)}{\sqrt{110.5}}\right) \approx P(Z > 1.9026) = 1 - \Phi(1.9026) \approx 1 - 0.9714 = 0.0286.
$$

Altogether then, p = 2P(X₁ > 1.1X₂) = 2 × 0.0286 = 0.0572.
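Corollary 7 is mechanical enough to code up. The following is a sketch only: `linear_combo` is a hypothetical helper of mine that applies the corollary's mean and variance formulas, and `statistics.NormalDist` (which is parameterised by the standard deviation, not the variance) supplies the CDF.

```python
import math
from statistics import NormalDist

def linear_combo(a: float, X: NormalDist, b: float, Y: NormalDist) -> NormalDist:
    """Distribution of aX + bY for independent normal X and Y (Corollary 7)."""
    mean = a * X.mean + b * Y.mean
    variance = a**2 * X.variance + b**2 * Y.variance
    return NormalDist(mean, math.sqrt(variance))

# Weights of the two sumo wrestlers: N(200, 50), so the standard deviation is sqrt(50).
X1 = NormalDist(200, math.sqrt(50))
X2 = NormalDist(200, math.sqrt(50))

total = linear_combo(1, X1, 1, X2)      # X1 + X2 ~ N(400, 100)
p_a = 1 - total.cdf(405)                # part (a)

diff = linear_combo(1, X1, -1.1, X2)    # X1 - 1.1 X2 ~ N(-20, 110.5)
p_b = 2 * (1 - diff.cdf(0))             # part (b), using the symmetry argument

print(round(p_a, 4), round(p_b, 4))
```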


Example 574. The weight (in kg) of a caught fish is modelled by X ∼ N(1, 0.4). The weight (in kg) of a caught shrimp is modelled by Y ∼ N(0.1, 0.1). Assume that the weights of any caught fish and shrimp are independent. (a) What is the probability that the total weight of 4 caught fish and 50 caught shrimp is greater than 10 kg?

(b) What is the probability that a caught fish weighs more than 9 times as much as a caught shrimp?

(a) Let S be the total weight of 4 caught fish and 50 caught shrimp. Note, importantly, that it would be wrong to write S = 4X + 50Y, because 4X + 50Y would be 4 times the weight of a single caught fish, plus 50 times the weight of a single caught shrimp.

In contrast, we want S to be the sum of the weights of 4 independent fish and 50 independent shrimp. Thus, we should instead write S = X₁ + X₂ + X₃ + X₄ + Y₁ + Y₂ + ⋅ ⋅ ⋅ + Y₅₀, where

• X₁ ∼ N(1, 0.4), X₂ ∼ N(1, 0.4), X₃ ∼ N(1, 0.4), and X₄ ∼ N(1, 0.4) are the weights of each caught fish.
• Y₁ ∼ N(0.1, 0.1), Y₂ ∼ N(0.1, 0.1), . . . , and Y₅₀ ∼ N(0.1, 0.1) are the weights of each caught shrimp.

Now, S ∼ N(4 × 1 + 50 × 0.1, 4 × 0.4 + 50 × 0.1) = N(9, 6.6).

(Note by the way that in contrast, 4X + 50Y ∼ N(9, 4² × 0.4 + 50² × 0.1) = N(9, 256.4), which has a rather different variance!) Thus, P(S > 10) ≈ 0.3485 (calculator).

(b) P(X > 9Y) = P(X − 9Y > 0). But X − 9Y ∼ N(1 − 9 × 0.1, 0.4 + 9² × 0.1) = N(0.1, 8.5). Thus, P(X − 9Y > 0) ≈ 0.5137 (calculator).
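The difference between S and 4X + 50Y is easy to see in a simulation. The sketch below is my own illustration (the seed, the number of trials, and the function names are arbitrary choices), and its printed figures are random estimates that should only be close to the theoretical values, not equal to them.

```python
import math
import random
from statistics import fmean, pvariance

random.seed(0)   # fixed seed so that the run is repeatable
TRIALS = 5_000
SD_FISH, SD_SHRIMP = math.sqrt(0.4), math.sqrt(0.1)

def total_weight() -> float:
    """S: the weights of 4 independent fish plus 50 independent shrimp."""
    fish = sum(random.gauss(1, SD_FISH) for _ in range(4))
    shrimp = sum(random.gauss(0.1, SD_SHRIMP) for _ in range(50))
    return fish + shrimp

def scaled_weight() -> float:
    """4X + 50Y: one fish counted 4 times plus one shrimp counted 50 times."""
    return 4 * random.gauss(1, SD_FISH) + 50 * random.gauss(0.1, SD_SHRIMP)

s_samples = [total_weight() for _ in range(TRIALS)]
w_samples = [scaled_weight() for _ in range(TRIALS)]

print(fmean(s_samples), pvariance(s_samples))    # theory: mean 9, variance 6.6
print(fmean(w_samples), pvariance(w_samples))    # theory: mean 9, variance 256.4
print(sum(s > 10 for s in s_samples) / TRIALS)   # theory: about 0.3485
```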


Exercise 251. (Answer on p. 1107.) Water and electricity usage are billed, respectively, at $2 per 1,000 litres (l) and $0.30 per kilowatt-hour (kWh). Assume that each month, the amount of water used by Ahmad (and his family) at their HDB flat is normally distributed with mean 25,000 l and variance 64,000,000 l². Similarly, the amount of electricity they use is normally distributed with mean 200 kWh and variance 10,000 kWh². Assume that monthly water usage and electricity usage are independent. (a) Find the probability that their total water and electricity utility bill in any given month exceeds $100. (b) Find the probability that their total water and electricity utility bill in any given year exceeds $1,000. Suppose instead that electricity usage is billed at $x per kWh. (c) Then what is the maximum value of x, in order for the probability that the total utility bill in a given month exceeds $100 to be 0.1 or less?


68.3

The Central Limit Theorem and The Normal Approximation

Suppose we have n independent random variables, each identically-distributed with mean µ ∈ R and variance σ 2 ∈ R. Then informally, the Central Limit Theorem (CLT) says that if n is “large enough”, then their sum (which is also a random variable) has the approximate distribution N (nµ, nσ 2 ). Formally:

Theorem 16. (The Central Limit Theorem.) Let $X_1, X_2, \dots, X_n$ be random variables. Suppose (i) they are independent; and (ii) they are identically-distributed, with mean $\mu \in \mathbb{R}$ and variance $\sigma^2 \in \mathbb{R}$. Then the sum $\sum_{i=1}^{n} X_i = X_1 + X_2 + \cdots + X_n$ converges in distribution to $N(n\mu, n\sigma^2)$.

Proof. The proof is a little advanced and thus entirely omitted from this book.

What does it mean for one random variable to “converge in distribution” to another? This is a little beyond the scope of the A-levels, but informally, this means that as n → ∞, the random variable $\sum_{i=1}^{n} X_i$ becomes “ever more” like the random variable with distribution $N(n\mu, n\sigma^2)$.

One big use of the CLT is this: If n is “large enough”, then the sum of n independent, identically-distributed random variables can be approximated by a normal distribution. How large is “large enough”? The most common rule-of-thumb is that n ≥ 30 is “large enough”, so that’s what we’ll use in this book, even though this is somewhat arbitrary.

Indeed, if the original distribution from which the random variables are drawn is not “nice enough”, then n ≥ 30 may not be “large enough”. (Informally, a distribution is “nice enough” if it is, among other things, fairly symmetric, fairly unimodal, and not too skewed.) You can safely assume that all distributions you’ll ever encounter in the A-levels are “nice enough”, so that the n ≥ 30 rule-of-thumb works. But whenever you use the CLT normal approximation, you should clearly state that you assume the distribution is “nice enough”.


Example 575. Let X be the random variable that is the sum of 100 rolls of a fair die. From our earlier work, we know that each die roll has mean 3.5 and variance 35/12. Problem: Find P(X ≥ 360) and P(X > 360).

The CLT says that since n = 100 ≥ 30 is large enough and the distribution is “nice enough” (we are assuming this), the random variable X can be approximated by the normal random variable Y ∼ N (100 × 3.5, 100 × 35/12) = N (350, 3500/12). Now, in using Y as an approximation for X, we might be tempted to simply write P(X ≥ 360) ≈ P(Y ≥ 360) and P(X > 360) ≈ P(Y > 360). Note however that X is a discrete random variable, so that P(X ≥ 360) ≠ P(X > 360). More specifically, P(X ≥ 360) = P(X = 360) + P(X > 360).

In contrast, Y is a continuous random variable, so that P(Y ≥ 360) = P(Y > 360). Hence, if we simply use the approximations P(X ≥ 360) ≈ P(Y ≥ 360) and P(X > 360) ≈ P(Y > 360), then implicitly we’d be saying that P(X = 360) = 0, which is blatantly false. To correct for this, we perform the so-called continuity correction. This says that we’ll instead use the approximations P(X ≥ 360) ≈ P(Y ≥ 359.5) and P(X > 360) ≈ P(Y ≥ 360.5). Thus, P(X ≥ 360) ≈ P(Y ≥ 359.5) ≈ 0.2890 (calculator) and P(X > 360) ≈ P(Y ≥ 360.5) ≈ 0.2693.
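As a rough check on Example 575, the sketch below computes the two continuity-corrected approximations and compares them with the exact distribution of the sum of 100 dice, built up one roll at a time. The function names are illustrative only, and the exact calculation is an addition for comparison, not something the example itself performs.

```python
from math import sqrt
from statistics import NormalDist

N_ROLLS = 100
DIE_MEAN = 3.5
DIE_VAR = 35 / 12

# Normal approximation Y ~ N(350, 3500/12) for the sum X of 100 rolls.
approx = NormalDist(N_ROLLS * DIE_MEAN, sqrt(N_ROLLS * DIE_VAR))


def approx_at_least(k):
    """P(X >= k) ~ P(Y >= k - 0.5), using the continuity correction."""
    return 1 - approx.cdf(k - 0.5)


def approx_more_than(k):
    """P(X > k) ~ P(Y > k + 0.5), using the continuity correction."""
    return 1 - approx.cdf(k + 0.5)


def exact_sum_pmf(n_rolls):
    """Exact distribution of the sum of n_rolls fair dice, adding one die at a time."""
    pmf = {0: 1.0}
    for _ in range(n_rolls):
        updated = {}
        for total, prob in pmf.items():
            for face in range(1, 7):
                updated[total + face] = updated.get(total + face, 0.0) + prob / 6
        pmf = updated
    return pmf


pmf = exact_sum_pmf(N_ROLLS)
exact_at_least = sum(p for total, p in pmf.items() if total >= 360)
exact_more_than = sum(p for total, p in pmf.items() if total > 360)

print(f"P(X >= 360): approx {approx_at_least(360):.4f}, exact {exact_at_least:.4f}")
print(f"P(X >  360): approx {approx_more_than(360):.4f}, exact {exact_more_than:.4f}")
```

The gap between the two exact values is precisely P(X = 360), which is the quantity the continuity correction is designed not to lose.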


Continuity Correction. If X is a discrete (integer-valued) random variable that is to be approximated by a continuous random variable Y , then for any integer k,

• P (X ≥ k) ≈ P (Y ≥ k − 0.5),
• P (X ≤ k) ≈ P (Y ≤ k + 0.5),
• P (X > k) ≈ P (Y > k + 0.5),
• P (X < k) ≈ P (Y < k − 0.5).

Note that if the random variable to be approximated is itself continuous, then there is no need to perform the continuity correction. This is illustrated in Exercise 253 below.

Exercise 252. Let X be the random variable that is the sum of 30 rolls of a fair die. Find P(100 ≤ X ≤ 110). (Answer on p. 1108.)

Exercise 253. The weight of each Coco-Pop is independently and identically distributed with mean 0.1 g and variance 0.004 g². A box of Coco-Pops has exactly 5,000 Coco-Pops. It is labelled as having a net weight of 500 g. Find the probability that the actual net weight of the Coco-Pops in this box is less than or equal to 499 g. (Answer on p. 1108.)


68.3.1

Normal Approximation to the Binomial Distribution

SYLLABUS ALERT This is in the 9740 (old) syllabus, but not in the 9758 (revised) syllabus. So you can skip this subsection if you’re taking 9758.

The binomial distribution is discrete. Thus, when using the normal distribution to approximate it, we must use the continuity correction.

Example 576. Flip a fair coin 1,000 times. Let X be the number of heads. Find the probability that there are more than 510 heads.

The CLT says that since n = 1000 ≥ 30 is large enough and the distribution is “nice enough” (we are assuming this), X can be approximated by the normal random variable Y ∼ N (1000 × 0.5, 1000 × 0.25) = N (500, 250). Thus, using also the continuity correction, we have P (X > 510) ≈ P (Y > 510.5) ≈ 0.2533 (calculator). (Without the continuity correction, we would instead have obtained P (Y > 510) ≈ 0.2635.) This turns out to be a very good approximation, because the exact probability, computed using the binomial distribution, is P (X > 510) ≈ 0.2533.

Exercise 254. In a school of 1000 students, each student has probability 0.9 of passing the A-level H2 Maths exam. Find the probability that there are at least 920 passes in the school. (You may make additional assumptions, but you should state them.) (Answer on p. 1108.)
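The numbers in Example 576 can be reproduced with a short script. This is only a sketch: it compares the normal approximation, with and without the continuity correction, against the exact binomial tail. Because p = 0.5, the exact tail is computed as a count of outcomes divided by 2^1000.

```python
from math import comb, sqrt
from statistics import NormalDist

N_FLIPS, P_HEADS = 1000, 0.5
THRESHOLD = 510  # we want P(X > 510)

normal = NormalDist(N_FLIPS * P_HEADS, sqrt(N_FLIPS * P_HEADS * (1 - P_HEADS)))

corrected = 1 - normal.cdf(THRESHOLD + 0.5)   # P(Y > 510.5), with continuity correction
uncorrected = 1 - normal.cdf(THRESHOLD)       # P(Y > 510), without it

# Exact binomial tail: with a fair coin every sequence of flips has probability 1/2**1000,
# so we count the sequences with 511 or more heads using exact integer arithmetic.
favourable = sum(comb(N_FLIPS, k) for k in range(THRESHOLD + 1, N_FLIPS + 1))
exact = favourable / 2**N_FLIPS

print(f"with correction:    {corrected:.4f}")
print(f"without correction: {uncorrected:.4f}")
print(f"exact binomial:     {exact:.4f}")
```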


68.3.2

Normal Approximation to the Poisson Distribution

SYLLABUS ALERT This is in the 9740 (old) syllabus, but not in the 9758 (revised) syllabus. So you can skip this subsection if you’re taking 9758.

The Poisson distribution is discrete. Thus, when using the normal distribution to approximate it, we must use the continuity correction.

Example 577. Suppose the monthly number of murders in Singapore can be modelled by X ∼ Po (0.1). Assuming that the month-to-month numbers of murders are independent, find the probability that there are more than 10 murders in a given 10-year span.

The number of murders in 10 years is Y = X1 + X2 + ⋅ ⋅ ⋅ + X120 , where X1 ∼ Po (0.1), X2 ∼ Po (0.1), . . . , X120 ∼ Po (0.1) are the numbers of murders in each of the 120 months. The CLT says that since n = 120 ≥ 30 is large enough and the distribution is “nice enough” (we are assuming this), Y can be approximated by the normal random variable T ∼ N (120 × 0.1, 120 × 0.1) = N (12, 12). Thus, using also the continuity correction, we have P(Y > 10) ≈ P(T > 10.5) ≈ 0.6675.

This is a decent approximation, because the exact probability, computed using the Poisson distribution, is P(Y > 10) ≈ 0.6528.

Exercise 255. Suppose the daily number of fatalities from motor vehicle accidents can be modelled by X ∼ Po(0.5). Find the probability that there are more than 200 fatalities from motor vehicle accidents in any given year. (You may make additional assumptions, but you should state them.) (Answer on p. 1109.)
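The comparison in Example 577 can be sketched as follows. The exact figure relies on the standard fact that a sum of independent Poisson random variables is itself Poisson, here with rate 120 × 0.1 = 12. The helper `poisson_cdf` is a hypothetical name written for this illustration.

```python
from math import exp, sqrt
from statistics import NormalDist

MONTHLY_RATE = 0.1
N_MONTHS = 120
rate = MONTHLY_RATE * N_MONTHS  # mean (and variance) of the 10-year total


def poisson_cdf(k, lam):
    """P(Y <= k) for Y ~ Po(lam), summing the probabilities term by term."""
    term = exp(-lam)      # P(Y = 0)
    total = term
    for j in range(1, k + 1):
        term *= lam / j   # P(Y = j) obtained from P(Y = j - 1)
        total += term
    return total


exact = 1 - poisson_cdf(10, rate)                    # P(Y > 10)
approx = 1 - NormalDist(rate, sqrt(rate)).cdf(10.5)  # P(T > 10.5), continuity-corrected

print(f"exact Poisson: {exact:.4f}")
print(f"normal approx: {approx:.4f}")
```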


69

The CLT is Amazing (Optional)

The Fundamental Theorems of Calculus and the CLT are the most profound and amazing results you’ll learn in H2 maths. This chapter briefly explains why the CLT is so amazing and why the normal distribution is ubiquitous.

69.1

The Normal Distribution in Nature

The normal distribution is ubiquitous in nature. The classic example is human height.75 Example 578. Below is a histogram of the heights of the 4,060 NBA players who ever played in an NBA game (through the end of the 2016 season). (Heights are reported in feet and a whole number of inches, where 1 in = 2.54 cm and 1 ft = 12 in, so that 1 ft = 30.48 cm.) The histogram has 28 bins and (arguably) looks normal (bell-shaped). The width of each bin is 1 inch. For example, the red bin says 410 players have had reported heights of 6 ft 7 in (approx. 200 cm). The pink (leftmost) bin is barely visible and says only 1 player has had a reported height of 5 ft 3 in (approx. 160 cm). The blue (rightmost) bin is also barely visible and says that only 2 players have had reported heights of 7 ft 7 in (approx. 231 cm). The average or mean height is approx. 6 ft 6 in (approx. 198 cm).

75

Data: Excel spreadsheet. Source: Basketball-Reference.com (retrieved June 15th, 2016). Caveats: (1) For some reason, out of the 4060 players in that database at the time of retrieval, there was exactly one player (George Karl) whose height was not listed. Wikipedia lists George Karl’s height as 6 ft 2 in, so that is what I have used for his height. (2) By NBA, I actually mean the BAA (1946-1949), the NBA (1949-present), and the ABA (1967-1976), combined. (3) As is well-known among basketball fans, the listed heights of NBA players are not accurate and can sometimes be off by as much as 2 to 3 inches (5 to 7.5 cm). (See this recent Wall Street Journal article.)


Manute Bol (approx 231 cm) and Muggsy Bogues (approx 160 cm) were briefly on the same team. (YouTube highlights.)

An infamous example of the normal distribution concerns human intelligence:


Example 579. The 1994 book The Bell Curve: Intelligence and Class Structure in American Life was named after the observation that the Intelligence Quotient (IQ) seems to be normally-distributed. This observation was neither new nor controversial (though some scholars would dispute the usefulness of IQ measures).

What made the book especially controversial were its claims that intelligence was largely heritable and that black Americans had lower intelligence than whites. The figure above is taken from p. 279 of the book. It suggests that • Black IQ is normally distributed, with a mean of around 80. • White IQ is normally distributed, with a mean of around 105.

Another example — the Galton box:


Example 580. (Galton box.) Small beans are released from the top, through a narrow passage. There are numerous pegs that tend to randomly divert the path of the beans. At the bottom of the box, there are many different narrow slots of equal width, into which the beans can drop. The beans will tend to form a bell shape at the bottom.

Beans just released.

Pegs divert beans.

Beans fill slots.

(Source: YouTube.)

Question: Many things in nature seem to be normally distributed. Why? We will try to answer this question, but only after we’ve illustrated how the Central Limit Theorem works.


69.2

Illustrating the Central Limit Theorem (CLT)

Example 581. Flip many fair coins. Model each with the Bernoulli random variables T1 , T2 , T3 , . . . , each with probability of success (heads) 0.5.

Let Xn = T1 + T2 + ⋅ ⋅ ⋅ + Tn ∼ B (n, 0.5).

Below are the histograms of the distributions of X1 , X2 , . . . , and X6 . X1 has probability 0.5 of taking on each of the values 0 and 1. X2 has probability 0.5 of taking on the value of 1; and probability of 0.25 of taking on each of the values 0 and 2. Etc.

Next are the histograms of the distributions of X7 , X8 , X9 , X10 , X20 , X30 , X40 , X50 , and X100 . Observe that as n grows, the shape of the probability distribution of Xn looks ever more bell-shaped. This is exactly what the CLT says.

The CLT says the following:

1. Draw a sufficiently-large number of independent and identical random variables from ANY distribution.76
2. Add them up to get another random variable S.
3. The probability distribution of S will look normal.

76

I should say nearly any distribution. For the classical CLT to apply, the variance must be finite.

What makes the CLT particularly amazing is that it works with ANY distribution. To illustrate, next up is an example where the original distribution is highly skewed and does not look at all bell-curved. Nonetheless, the CLT still works out nicely.

Example 582. Flip many biased coins, each with probability 0.9 of heads. Model each with the Bernoulli random variables Y1 , Y2 , Y3 , . . . , each with probability of success (heads) 0.9. Let Sn = Y1 + Y2 + ⋅ ⋅ ⋅ + Yn be the number of heads in the first n coin-flips. (By the way, Sn ∼ B (n, 0.9).)

Next are the histograms of the distributions of S1 , S2 , . . . , and S10 . S1 has probability 0.1 of taking on the value 0 and 0.9 of taking on the value 1. S2 has probability 0.01 of taking on the value 0, 0.18 of taking on the value 1, and 0.81 of taking on the value 2. S3 has probability 0.001 of taking on the value 0, 0.027 of taking on the value 1, 0.243 of taking on the value 2, and 0.729 of taking on the value 3. Etc.

It certainly does not look like the distribution of Sn is becoming increasingly bell-curved. Well, let’s see.

Below are the histograms of the distributions of S20 , S30 , S40 , S50 , and S100 . Remarkably enough, as n grows, the shape of the probability distribution of Sn looks ever more bell-shaped, as promised by the CLT.
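One way to put a number on "looks ever more bell-shaped" in Example 582 is to measure the largest gap between the exact binomial CDF of Sn and the CDF of the matching normal distribution, evaluated with the continuity correction. This is just one of several reasonable measures, chosen here for simplicity, and the function names are illustrative.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """P(S_n = k) for S_n ~ B(n, p), computed via logarithms to avoid overflow."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)


def max_cdf_gap(n, p):
    """Largest |P(S_n <= k) - P(Y <= k + 0.5)| over k, where Y ~ N(np, np(1-p))."""
    normal = NormalDist(n * p, sqrt(n * p * (1 - p)))
    running_total = 0.0
    worst = 0.0
    for k in range(n + 1):
        running_total += binomial_pmf(k, n, p)
        worst = max(worst, abs(running_total - normal.cdf(k + 0.5)))
    return worst


P_HEADS = 0.9  # the biased coin from the example
gaps = {n: max_cdf_gap(n, P_HEADS) for n in (10, 100, 1000)}
for n, gap in gaps.items():
    print(f"n = {n:4d}: largest CDF gap = {gap:.4f}")
```

The gap shrinks as n grows, even though each individual coin-flip is heavily skewed towards heads.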


69.3

Why Are So Many Things Normally Distributed?

We now return to the question posed earlier: Many things in nature seem to be normally distributed. Why? This is a deep question. The standard quick answer is this: If S is the sum of a very large number of independent random variables, then by the CLT S is (approximately) normally distributed. Examples to illustrate:

Example 583. Assume that human height is entirely determined by 1000 independent genes (assume all human beings have these 1000 genes). Assume that each of these 1000 genes is associated with an independent random variable X1 , X2 , . . . , X1000 , each identically distributed with mean µX and variance σX². Assume also that human height is simply equal to the sum of these random variables. That is, a human being’s height is simply given by H = X1 + X2 + ⋅ ⋅ ⋅ + X1000 . Then the CLT says that since n = 1000 is “large enough”, H will be approximately normally distributed, with mean 1000µX and variance 1000σX². Amongst the world’s 7.4 billion people, there will be some very short people and some very tall people, but most people will be near the mean height 1000µX .

Example 584. Ah Kow’s Mooncake Factory manufactures mooncakes. Each mooncake is supposed to weigh exactly 185 g, if the standard recipe is followed with absolute precision. However, the exact weight of each mooncake will usually vary, due to myriad factors, such as whether the baker was paying attention, how much water the baker added, how long the mooncake was left in the oven, the room temperature that day, etc. Assume there are 300 independent factors that determine the exact weight of a mooncake. Assume that each of these 300 factors is associated with an independent random variable Y1 , Y2 , . . . , Y300 , each identically distributed with mean µY and variance σY². Assume also that the weight of a mooncake is simply given by W = Y1 + Y2 + ⋅ ⋅ ⋅ + Y300 .

Then the CLT says that since n = 300 is “large enough”, W will be approximately normally distributed, with mean 300µY and variance 300σY². Amongst the millions of mooncakes produced, there will be some very light mooncakes and some very heavy mooncakes, but most mooncakes will be near the mean weight 300µY .


69.4

Don’t Assume That Everything is Normal

Mathematical modellers often assume that “everything is normal”. There are three justifications for this:

1. We have strong empirical evidence that many things in nature are normally-distributed.
2. We have a strong theoretical reason (the CLT) for why this might be so.
3. The normal distribution is easy to handle (because e.g. the maths is easy, compared to some other distributions).

However, many things are not normally-distributed. It is thus a mistake to assume that “everything is normal”. One example of a common but non-normal distribution found in nature is the Pareto distribution. We’ll skip the formal details. Informally, it is associated with the Pareto Principle or the 80-20 Rule, and businesspersons say things like:

“80% of your output is produced by 20% of your employees.”

“80% of your sales come from 20% of your clients.”

It is believed the Pareto distribution is a good description of many aspects of human performance (though apparently not of height or IQ). By the way, it was named after Vilfredo Pareto, who in 1896 found that approximately 80% of the land in Italy was owned by 20% of the population. Let’s see if the points scored in the NBA resemble the Pareto distribution.77 In particular, is it the case that 20% of NBA players have scored 80% of the points?

77

Source: Basketball-Reference.com. Caveats: (1) The data were retrieved on June 15th, 2016, so the points scored are between 1946 and that date. (2) By NBA, I actually mean the BAA (1946-1949), the NBA (1949-present), and the ABA (1967-1976), combined. Data: Excel spreadsheet.


Example 585. Below is a histogram of the total points scored by each of 4,060 NBA players. Clearly, the total points scored by each player is not normally distributed. The histogram has 20 bins of equal width. The leftmost bin says that 2,615 players scored 0 to 1,919 points. The rightmost bin says that only 2 players scored 36,468 to 38,387 points. The grand total number of points ever scored in the NBA is 11,565,923. Of these, 8,424,242 (or 72.8%) were scored by the top 20% of players (812 players). So it appears that the 80-20 Rule is a reasonably good description of the distribution of total points scored by players! In contrast, the normal distribution is obviously not a good description.
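The "top 20%" calculation in Example 585 is simple to express in code. The sketch below first reproduces the 72.8% figure from the totals reported in the example, then applies a small helper, `top_share` (a hypothetical name), to a made-up list of ten numbers. The toy list is purely illustrative and is not NBA data.

```python
def top_share(values, fraction):
    """Share of the grand total contributed by the top `fraction` of the entries."""
    ranked = sorted(values, reverse=True)
    n_top = round(len(ranked) * fraction)
    return sum(ranked[:n_top]) / sum(ranked)


# Figures reported in the example.
N_PLAYERS = 4060
TOTAL_POINTS = 11_565_923
TOP_20_POINTS = 8_424_242

top_count = round(N_PLAYERS * 0.20)           # number of players in the top 20%
reported_share = TOP_20_POINTS / TOTAL_POINTS

# Made-up toy data (NOT real scores), only to show how the helper is used.
toy_scores = [100, 40, 20, 10, 10, 5, 5, 5, 3, 2]
toy_share = top_share(toy_scores, 0.20)

print(f"top 20% = {top_count} players, share of points = {reported_share:.1%}")
print(f"toy example: top 20% hold {toy_share:.0%} of the total")
```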

It’s fairly obvious to anyone who bothers graphing the data that “points scored in the NBA” is not normally-distributed. There are however instances where this is less obvious. One is thus more likely to mistakenly assume a normal distribution. A famous and tragic example of this is given by the financial markets.


Example 586. The Dow Jones Industrial Average (DJIA) is one of the world’s leading stock market indices. It is a weighted average of the share prices of 30 of the largest US companies (e.g. Apple, Coca-Cola, McDonald’s). Trading starts in the morning and closes in the afternoon (right now, the trading hours are 9:30 am to 4 pm). The next graph is of the daily closing values for the past 30 years.

[Figure: DJIA (1,000s), Daily Close, 16/06/1986 - 15/06/2016. Red vertical lines indicate the first trading day of each year. Vertical axis: 0 to 20 (thousands); horizontal axis: 1987 to 2015.]

Let qi be the % change in closing value on day i, as compared to day i − 1. For example, on June 14th, 2016, the DJIA closed at 17,674.82. On June 15th, 2016, it closed at 17,640.17, 34.65 points lower than the previous day’s close. Thus,

$$q_{20160615} = \frac{-34.65}{17{,}674.82} \approx -0.20\%.$$

The graph here is of q, on 36,044 consecutive trading days (over 131 years). In black are those days when the DJIA rose; in red are those when it fell. Can you spot the single largest one-day fall in the DJIA? (We’ll talk about this singular day shortly.)

[Figure: % Daily Change in DJIA (q), 17/02/1885 - 15/06/2016. Green vertical lines mark the first trading day of years ending in 0. Vertical axis: −24% to 16%; horizontal axis: 1890 to 2010.]

The graph here is also of q, but in the form of a histogram. Each bin has width 0.1% (except the leftmost and rightmost bins). For example, on 2,204 days (out of 36,044), q ∈ (−0.1%, 0%] (the DJIA fell by between 0.1% and 0%).

On 78 days, q ≤ −5% (the DJIA fell by 5% or more). On 70 days, q > 5% (the DJIA rose by more than 5%).

It seems reasonable to say that q is normally-distributed (at least if we ignore the leftmost and rightmost bins).

The sample mean and standard deviation (from the 36,044 observations) are µ ≈ 0.023% and σ ≈ 1.064%. So let’s suppose q were normally-distributed with mean µ and variance σ². If so, then we’d predict (as per the properties of the normal distribution) that:

1. 0.6827 of the time, q is within 1 standard deviation of the mean, i.e. q ∈ (−1.04%, 1.09%).
2. 0.9545 of the time, q is within 2 standard deviations of the mean, i.e. q ∈ (−2.10%, 2.15%).

As it actually turned out,

1. 0.7965 of the time (28,709 out of 36,044 days), q was within 1 standard deviation of the mean. (A bit off, but not too bad.)
2. 0.9536 of the time (34,373 out of 36,044 days), q was within 2 standard deviations of the mean. (Almost exactly correct!)

In addition to the above “evidence”, we might make the following theoretical argument: Share prices are affected by myriad random and arguably-independent factors. Hence, by the CLT, we’d expect share prices (and thus q as well) to be normally-distributed.

Unfortunately, modelling q as a normal random variable would be a disastrous mistake, especially when it comes to predicting rare events. If q is indeed normally distributed, then we’d predict that the DJIA rises or falls by more than ...

1. ... 5% less than once every 2,000 years.
2. ... 7% less than once every 100 million years.
3. ... 10% less than once every 480,000,000 billion years. (For comparison, the universe is estimated to be 13.8 billion years old.)

And so, during the 131 years for which we have data, it should have been very unlikely that the DJIA ever rose or fell by more than 5%. But as it actually turned out, during these 131 years, the DJIA rose or fell by more than ...

1. ... 5% 148 times.
2. ... 7% 40 times.
3. ... 10% 10 times.

(Data: Excel spreadsheet.)
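The mismatch between the normal model and the DJIA data can be sketched numerically. The script below uses only the rounded figures quoted in the example (µ ≈ 0.023%, σ ≈ 1.064%, 36,044 trading days) and computes how many large daily moves the normal model would lead us to expect over the whole sample. Because it works from rounded inputs, its results need not match the author's own calculations to the last digit.

```python
from math import erfc, sqrt

MU, SIGMA = 0.023, 1.064   # sample mean and standard deviation of q, in percent
N_DAYS = 36_044            # trading days in the sample


def upper_tail(z):
    """P(Z > z) for a standard normal Z; erfc keeps precision far out in the tail."""
    return 0.5 * erfc(z / sqrt(2))


def prob_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return 1 - 2 * upper_tail(k)


def prob_move_beyond(pct):
    """P(q > pct) + P(q < -pct) if q were N(MU, SIGMA**2)."""
    return upper_tail((pct - MU) / SIGMA) + upper_tail((pct + MU) / SIGMA)


# Observed figures quoted in the example.
observed_within = {1: 28_709 / N_DAYS, 2: 34_373 / N_DAYS}
observed_big_moves = {5: 148, 7: 40, 10: 10}

for k, observed in observed_within.items():
    print(f"within {k} sd: predicted {prob_within(k):.4f}, observed {observed:.4f}")

expected_big_moves = {pct: N_DAYS * prob_move_beyond(pct) for pct in observed_big_moves}
for pct, expected in expected_big_moves.items():
    print(f"moves beyond {pct}%: expected {expected:.3g} days, "
          f"observed {observed_big_moves[pct]} days")
```

Under the normal model, fewer than one such day would be expected in the entire sample at every threshold, which is the sense in which the model fails badly in the tails.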


70

Statistics: Introduction (Optional)

70.1

Probability vs. Statistics

Probability Given a known model, what can we say about the data we’ll observe?

Statistics Given observed data, what can we say about the model?

Example 587. Let p be the probability of a coin-flip turning up heads. (p is an example of a parameter.) Flip a coin thrice. Probability question: “Suppose we know that p = 1/2. Then what can we say about the probability of observing three heads (i.e. P (HHH))?” (For most probabilists, this is a simple question with a straightforward answer: P (HHH) = 1/8.) Statistics question: “Suppose we observe HHH. Then what can we say about p?” (Different statisticians will give different answers.) In the real world, we will almost never know what p “truly” is. Instead, we usually only have some limited data observations (such as observing HHH). Probability is about making heroic assumptions about what p is, in order to draw inferences about what the observed data will look like. In contrast, statistics is about using limited, observed data to draw statistical inferences about the model and its parameters.


70.2

Objectivists vs Subjectivists

Example 588. Ann and Bob are two infinitely-intelligent persons. Ann believes that the probability of rain tomorrow is 0.2 and Bob believes that it is 0.6.

• Objectivist view: There is some single, “correct” probability p of rain tomorrow. Perhaps no one (except some Supreme Being up above) will ever know what exactly p is. But in any case, we can say that exactly one of the following must be true:
1. Ann is correct (and Bob is wrong);
2. Bob is correct (and Ann is wrong); or
3. Both Ann and Bob are wrong.

• Subjectivist view: A probability is not some objective, rational thing that exists outside the mind of any human being. There is no “correct” probability. Instead, a probability is merely the degree of belief in the occurrence of an event attributed by a given person at a given instant and with a given set of information. (De Finetti, infra, pp. 3-4.) Thus, Ann and Bob can legitimately disagree about the probability of rain tomorrow, without either being wrong. After all, the numbers 0.2 and 0.6 are merely their personal, subjective degrees of belief in the likelihood of rain tomorrow.

Bruno de Finetti (1906-1985) was perhaps the most famous and extreme subjectivist ever. In a preface to a book, he provocatively declared, with his CAPSLOCK key stuck, that PROBABILITY DOES NOT EXIST.78

In this textbook (and for the A-levels), we will be strict objectivists. The main practical implications of being an objectivist are illustrated in the following examples:

78

Theory of Probability, v. 1, 1990 edition, p. x. (Originally published as Teoria delle probabilità in 1970.)


Example 589. Judge Ann says the murder suspect is probably innocent. Judge Bob says the suspect is probably guilty. Objectivist interpretation: Ann and Bob cannot both be correct. The suspect is either innocent (with probability 1) or guilty (with probability 1). In fact, we can go even further and say that both Ann and Bob are talking nonsense. It is nonsensical to say things like the suspect is “probably” innocent (or “probably” guilty), because the suspect either is innocent or not. Subjectivist interpretation: Ann and Bob are perfectly well-entitled to their beliefs. Moreover, it is perfectly meaningful to say things like the suspect is “probably” innocent (or “probably” guilty). Ann and Bob do not know for sure whether the suspect is innocent or guilty. They are thereby perfectly well-entitled to speak probabilistically about the innocence or guilt of the suspect.

Example 590. We flip a coin 100 times and get 100 heads. Given these observed data (100 heads out of 100 flips), what can we say (what statistical inference can we make) about whether or not the coin is fair? Subjectivist answer: The coin is probably not fair. (This is perhaps the answer that most laypersons would give.) Objectivist answer: The coin either is fair (with probability 1) or isn’t fair (with probability 1). Subjectivist statements like the coin is “probably” not fair are nonsensical. Most untrained laypersons are innately subjectivist. Yet in this book (and also for the A-levels), you’ll be trained to think like strict objectivists. Note though that it is not the case that one school of thought is correct and the other wrong. Both the objectivist and subjectivist schools of thought have merit. The growing consensus amongst statisticians is to take the best of both worlds. Nonetheless, in this textbook, we learn only the objectivist interpretation. Not because it is necessarily superior, but rather because 1. The maths is easier. 2. Tradition: For most of the 20th century, the objectivist interpretation was favoured.


71

Sampling

71.1

Population

Definition 131. A population is any ordered set (i.e. vector) of objects we’re interested in. A population can be finite or infinite. But to keep things simple, we’ll look at examples where it is finite. Example 591. The two candidates for the 2016 Bukit Batok SMC By-Election are Dr. Chee Soon Juan and PAP Guy. It is the night of the election and voting has just closed.

Our objects-of-interest are the 23,570 valid ballots cast. (A ballot is simply a piece of paper on which a vote is recorded. The words ballot and vote are often used interchangeably.) Arrange the ballots in any arbitrary order. Let v1 = 1 if the first ballot is in favour of Dr. Chee and v1 = 0 otherwise. Similarly and more generally, for any i = 2, 3, . . . , 23570, let vi = 1 if the ith ballot is in favour of Dr. Chee and vi = 0 otherwise.

Our population here is simply the ordered set P = (v1 , v2 , . . . , v23570 ). So in this example, the population is simply an ordered set of 1s and 0s.


71.2

Population Mean and Population Variance

The population mean µ is simply the average across all population values. The population variance σ² is a measure of the variation across all population values. Formally:79

Definition 132. Given a finite population P = (v1 , v2 , . . . , vk ), the population mean µ and population variance σ² are defined by

$$\mu = \frac{v_1 + v_2 + \cdots + v_k}{k} = \frac{\sum_{i=1}^{k} v_i}{k} \quad \text{and} \quad \sigma^2 = \frac{(v_1 - \mu)^2 + (v_2 - \mu)^2 + \cdots + (v_k - \mu)^2}{k} = \frac{\sum_{i=1}^{k} (v_i - \mu)^2}{k}.$$

Example 591 (continued from above). Suppose that of the 23,570 votes, 9,142 were for Dr. Chee and the remaining 14,428 were not. So the vector (v1, v2, . . . , v23570) contains 9,142 1s and 14,428 0s.

Then, writing n = 23570 for the population size, the population mean is

$$\mu = \frac{v_1 + v_2 + \dots + v_n}{n} = \frac{9142 \times 1 + 14428 \times 0}{23570} = \frac{9142}{23570} \approx 0.3879.$$

In this particular example, the population values are binary (either 0 or 1). And so we have a nice alternative interpretation: the population mean is also the population proportion. In this case, it is the proportion of the population who voted for Dr. Chee. So here the proportion of votes for Dr. Chee is about 0.3879.

The population variance is

$$\sigma^2 = \frac{(v_1-\mu)^2 + (v_2-\mu)^2 + \dots + (v_n-\mu)^2}{n} = \frac{9142\left(1-\frac{9142}{23570}\right)^2 + 14428\left(0-\frac{9142}{23570}\right)^2}{23570} \approx 0.2374.$$

As usual, the variance tells us about the degree to which the vi’s vary. Of course, in this example, we already know that the vi’s can take on only two values: 0 and 1. So the variance isn’t terribly interesting or informative in this example. In particular, it doesn’t tell us anything more that the population mean didn’t already tell us (indeed, it can be shown that in this example, σ² = µ − µ²).

⁷⁹ In the case of an infinite population, the definitions of µ and σ² must be adjusted slightly, but the intuition is the same.
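The two calculations in the ballot example can be checked with a short Python sketch. The variable names are illustrative only, and exact fractions are used so that the identity σ² = µ − µ² for a population of 0s and 1s can be verified without rounding.

```python
from fractions import Fraction

# The ballot population from the running example: 9,142 ones and 14,428 zeros.
population = [1] * 9142 + [0] * 14428
k = len(population)

# Population mean and population variance, following Definition 132
# (note the denominator k, not k - 1).
mu = Fraction(sum(population), k)
sigma2 = sum((v - mu) ** 2 for v in population) / k

print(round(float(mu), 4))      # population mean (also the proportion of 1s)
print(round(float(sigma2), 4))  # population variance

# For a population of 0s and 1s, the variance equals mu - mu^2 exactly.
assert sigma2 == mu - mu ** 2
```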

71.3 Parameter

Informally, a parameter is some number we’re interested in and which may be calculated based on the population.

Example 591 (continued from above). A parameter we might be interested in is the population mean µ, which is also the proportion of votes in favour of Dr. Chee. (Another parameter we might be interested in is the population variance σ², but let’s ignore that for now.)

Voting has just closed. In a few hours’ time (after the vote-counting is done), we will know exactly what µ is. But right now, we still don’t know what µ is.

Suppose we are impatient and want to know right away what µ might be. In other words, suppose we want to get an estimate of the true value of µ. What are some possible methods of getting a quick estimate of µ?

One possibility is to observe a random sample of 100 votes and count the proportion of these 100 votes that are in favour of Dr. Chee. So for example, say we do this and observe that 39 out of the 100 votes are for Dr. Chee. That is, we find that the observed sample mean (which in this context can also be called the observed sample proportion) is 0.39. Then we might conclude:

Based on this observed random sample of 100 votes, we estimate that µ is 0.39.

The layperson might be content with this. But the statistician digs a little deeper and asks questions such as:

• How do we know if this estimate is “good”?
• What are the criteria to determine whether an estimate is “good”?

We’ll now try to address, if only to a limited extent, these questions. But to do so, we must first precisely define terms like sample and estimate.

71.4 Distribution of a Population

Informally,⁸⁰ the distribution of a population tells us

1. The range of possible values taken on by the objects in the population; and
2. The proportion of the population that takes on each possible value.

Example 591 (continued from above). The population is P = (v1, v2, . . . , v23570), the ordered set of 23,570 ballots. Suppose that of these, 9,142 are votes for Dr. Chee (hence recorded as 1s) and the remaining 14,428 are for PAP Guy (hence recorded as 0s). Then the distribution of the population can informally be described in words as:

• A proportion 9142/23570 of the population are 1s, and
• A proportion 14428/23570 of the population are 0s.

Example 592. The population is P = (3, 4, 7, 7, 2, 3).

Then the distribution of the population can informally be described in words as:

• A proportion 1/6 of the population are 2s;
• A proportion 2/6 of the population are 3s;
• A proportion 1/6 of the population are 4s; and
• A proportion 2/6 of the population are 7s.
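The informal description above is just a table of values and proportions. As a rough sketch (the names below are illustrative), it can be computed for the population of Example 592 as follows:

```python
from collections import Counter
from fractions import Fraction

population = (3, 4, 7, 7, 2, 3)

# Map each distinct value to the proportion of the population taking that value.
counts = Counter(population)
distribution = {value: Fraction(count, len(population))
                for value, count in sorted(counts.items())}

for value, proportion in distribution.items():
    print(value, proportion)
```

Note that `Fraction` reduces 2/6 to 1/3, so the proportions for 3 and 7 are printed in lowest terms.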

⁸⁰ Formally, we’d define the population distribution as a function. Indeed, some writers define the population itself as the distribution function.

71.5 A Random Sample

Informally, to observe a random sample of size n, we follow this procedure: Imagine the 23,570 ballots are in a single big bag.

1. Randomly pull out one ballot. Record the vote (either we write x1 = 1, if the vote was for Dr. Chee, or we write x1 = 0, if it wasn’t).
2. Put this ballot back in (this second step is why we call it sampling with replacement).
3. Repeat the above n times in total, so as to record down the values of x1, x2, . . . , xn.

We call (x1, x2, . . . , xn) an observed random sample of size n. Note that this is an ordered set (or vector) of numbers. Formally:

Definition 133. Let P be a population. Then the random vector (i.e. ordered set of random variables) (X1, X2, . . . , Xn) is a random sample of size n from the population P if

• X1, X2, . . . , Xn are independent; and
• X1, X2, . . . , Xn are identically-distributed, with the same distribution as P.

As always, we must be careful to distinguish between a function and a value taken on by the function. This table summarises.

Function | Value taken by the function
f is a function | f(x) is a possible value taken on by the function
X is a random variable | x is a possible observed value of the random variable
(X1, X2, . . . , Xn) is a random sample | (x1, x2, . . . , xn) is a possible observed random sample

An example to illustrate:


Example 591 (continued from above). To repeat, the distribution of the population P = (v1, v2, . . . , v23570) can informally be described in words as:

• 9142/23570 of the population were 1s; and
• 14428/23570 of the population were 0s.

Let X1, X2, and X3 be independent random variables, each with the same distribution as the population. That is, for each i = 1, 2, 3,

$$P(X_i = 0) = \frac{14428}{23570} \quad\text{and}\quad P(X_i = 1) = \frac{9142}{23570}.$$

The ordered set (or vector) (X1, X2, X3) is a random sample of size 3.

An example of an observed random sample of size 3 might be (x1, x2, x3) = (1, 1, 0): this would be where we randomly sample 3 ballots (with replacement) and find that the first two are votes for Dr. Chee but the third is not. Another example of an observed random sample of size 3 might be (x1, x2, x3) = (0, 0, 0): this would be where we randomly sample 3 ballots (with replacement) and find that none of the three are for Dr. Chee.

As another example, (X1, X2, X3, X4, X5) is a random sample of size 5.

An example of an observed random sample of size 5 might be (x1, x2, x3, x4, x5) = (0, 1, 0, 1, 0): this would be where we randomly sample 5 ballots (with replacement) and find that only the second and fourth are votes for Dr. Chee. Another example of an observed random sample of size 5 might be (x1, x2, x3, x4, x5) = (1, 1, 0, 1, 1): this would be where we randomly sample 5 ballots (with replacement) and find that only the third is not a vote for Dr. Chee.

In this textbook, we’ll be very careful to distinguish between a random sample (which is a vector of random variables) and an observed random sample (which is a vector of real numbers). This may be contrary to the practice of your teachers or indeed even the A-level exams.
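The bag-of-ballots procedure can be imitated on a computer. The following is only a sketch: the function name `observe_random_sample` and the seed are made up for illustration, and Python's pseudo-random generator stands in for physically drawing ballots.

```python
import random

# The ballot population: 9,142 ones (votes for Dr. Chee) and 14,428 zeros.
population = [1] * 9142 + [0] * 14428

def observe_random_sample(population, n, rng):
    """Draw n values with replacement, so draws are independent and each has
    the same distribution as the population."""
    return tuple(rng.choice(population) for _ in range(n))

rng = random.Random(2016)  # seed chosen arbitrarily, only for reproducibility
print(observe_random_sample(population, 3, rng))
print(observe_random_sample(population, 5, rng))
```

Each call returns one observed random sample (a tuple of 0s and 1s). The function itself plays the role of the random sample (X1, X2, . . . , Xn).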

71.6 Sample Mean and Sample Variance

Definition 134. Let (X1, X2, . . . , Xn) be a random sample of size n. Then the corresponding sample mean X̄ and the sample variance S² are the random variables defined by:

$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n}, \qquad S^2 = \frac{(X_1-\bar{X})^2 + (X_2-\bar{X})^2 + \dots + (X_n-\bar{X})^2}{n-1} = \frac{\sum_{i=1}^{n} (X_i-\bar{X})^2}{n-1}.$$

(The List of Formulae you get during exams will contain the observed sample variance.)

Note that strangely enough, the denominator of S² is n − 1, rather than n as one might expect. As we’ll see later, there is a good reason for this.

By the way, there are two other formulae for calculating the sample variance:

Fact 81. Let S = (X1, X2, . . . , Xn) be a random sample of size n. Let X̄ be the sample mean and S² be the sample variance. Let a ∈ R be a constant. Then

$$\text{(a)}\quad S^2 = \frac{\sum_{i=1}^{n} X_i^2 - \frac{\left[\sum_{i=1}^{n} X_i\right]^2}{n}}{n-1} \qquad\text{and}\qquad \text{(b)}\quad S^2 = \frac{\sum_{i=1}^{n} (X_i-a)^2 - \frac{\left[\sum_{i=1}^{n} (X_i-a)\right]^2}{n}}{n-1}.$$

(The List of Formulae has (a) but not (b).)

Proof. Optional, see p. 925 in the Appendices.
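Fact 81 can be spot-checked numerically. The sketch below is not a proof: it compares the definition of the observed sample variance with the two shortcut formulae on one made-up data set (the numbers and function names are hypothetical, not from the text).

```python
from fractions import Fraction

def s2_definition(xs):
    """Observed sample variance from Definition 134 (denominator n - 1)."""
    n = len(xs)
    xbar = Fraction(sum(xs), n)
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

def s2_shortcut(xs, a=0):
    """Fact 81(b); with a = 0 this reduces to Fact 81(a)."""
    n = len(xs)
    total = sum(x - a for x in xs)
    total_sq = sum((x - a) ** 2 for x in xs)
    return (total_sq - Fraction(total ** 2, n)) / (n - 1)

xs = [12, 15, 9, 20, 14]  # hypothetical data, not from the text
print(s2_definition(xs), s2_shortcut(xs), s2_shortcut(xs, a=10))
```

All three printed values agree, whatever integer constant a is chosen.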


Once again, it is important to distinguish between

• The sample mean X̄ (a random variable) vs. the observed sample mean x̄ (a real number).
• The sample variance S² (a random variable) vs. the observed sample variance s² (a real number).

Example 591 (continued from above). Let (X1, X2, X3) be a random sample of size 3. The corresponding sample mean X̄ and sample variance S² are these random variables:

$$\bar{X} = \frac{X_1 + X_2 + X_3}{3}, \qquad S^2 = \frac{(X_1-\bar{X})^2 + (X_2-\bar{X})^2 + (X_3-\bar{X})^2}{3-1}.$$

Suppose our observed random sample of size 3 is (1, 0, 0). Then the corresponding observed sample mean x̄ and observed sample variance s² are these real numbers:

$$\bar{x} = \frac{x_1 + x_2 + x_3}{n} = \frac{1 + 0 + 0}{3} = \frac{1}{3},$$

$$s^2 = \frac{(x_1-\bar{x})^2 + (x_2-\bar{x})^2 + (x_3-\bar{x})^2}{n-1} = \frac{\left(1-\frac{1}{3}\right)^2 + \left(0-\frac{1}{3}\right)^2 + \left(0-\frac{1}{3}\right)^2}{3-1} = \frac{1}{3}.$$

Let (X1, X2, X3, X4, X5) be a random sample of size 5. The corresponding sample mean X̄ and sample variance S² are these random variables:

$$\bar{X} = \frac{X_1 + X_2 + X_3 + X_4 + X_5}{5}, \qquad S^2 = \frac{(X_1-\bar{X})^2 + (X_2-\bar{X})^2 + \dots + (X_5-\bar{X})^2}{5-1}.$$

Suppose our observed random sample of size 5 is (0, 1, 0, 0, 1). Then the corresponding observed sample mean x̄ and observed sample variance s² are these real numbers:

$$\bar{x} = \frac{x_1 + x_2 + x_3 + x_4 + x_5}{n} = \frac{0 + 1 + 0 + 0 + 1}{5} = \frac{2}{5} = 0.4,$$

$$s^2 = \frac{(x_1-\bar{x})^2 + (x_2-\bar{x})^2 + (x_3-\bar{x})^2 + (x_4-\bar{x})^2 + (x_5-\bar{x})^2}{n-1} = \frac{\left(0-\frac{2}{5}\right)^2 + \left(1-\frac{2}{5}\right)^2 + \left(0-\frac{2}{5}\right)^2 + \left(0-\frac{2}{5}\right)^2 + \left(1-\frac{2}{5}\right)^2}{5-1} = \frac{1.2}{4} = 0.3.$$

We call a random variable an estimator if it is used to generate estimates (“guesses”) for some parameter. Example:

Example 591 (continued from above). It is the night of the election and polling has just closed. We still do not know the true proportion µ that voted for Dr. Chee.

We decide to get a random sample of size 3: (X1, X2, X3). The corresponding sample mean X̄ = (X1 + X2 + X3)/3 shall be an estimator for µ. (Informally, an estimator is a method for generating “guesses” for some unknown parameter, in this case µ.) For every observed random sample, the estimator generates an estimate.

Suppose our observed random sample of size 3 is (1, 0, 0). We calculate the corresponding observed sample mean to be x̄ = 1/3. We say that x̄ = 1/3 is an estimate for µ. (By the way, unless we are extremely lucky, it is highly unlikely that the true value of the unknown parameter µ is precisely 1/3. After all, 1/3 is merely an estimate obtained from a single observed random sample of size 3.)

Suppose instead that our observed random sample of size 3 were (0, 1, 1). Then the corresponding observed sample mean would be x̄ = 2/3. We’d instead say that x̄ = 2/3 is our estimate for µ.

There is also more than one estimator we can use. For example, suppose instead that we decide to get a random sample of size 5: (X1, X2, X3, X4, X5). We shall instead use the corresponding sample mean X̄ = (X1 + X2 + X3 + X4 + X5)/5 as our estimator for µ. And so for example, suppose our observed random sample of size 5 is (0, 1, 0, 0, 1). Then the corresponding observed sample mean is x̄ = 0.4, and x̄ = 0.4 would be our estimate for µ.

Now, are these estimators and estimates “good” or “reliable”? How much should we trust them? These are questions that we’ll address in the next section.


A different example:

Example 593. Suppose we wish to find the average height µ (in cm) of an adult male. As a practical matter, it would be quite difficult to locate and record the height of every adult male in the world. So instead, what we might do is to randomly pick 4 adult males and record their heights. This gives us a random sample (H1, H2, H3, H4) of heights. The corresponding sample mean is the random variable H̄ = (H1 + H2 + H3 + H4)/4. H̄ shall serve as our estimator for µ.

Suppose our observed random sample is (h1, h2, h3, h4) = (178, 165, 182, 175). Then the corresponding observed sample mean is

$$\bar{h} = \frac{h_1 + h_2 + h_3 + h_4}{n} = \frac{178 + 165 + 182 + 175}{4} = 175.$$

Thus, h̄ = 175 serves as an estimate (or “guess”) of the true average male height µ.

Again, are the estimator H̄ and estimate h̄ = 175 “good” or “reliable”? How much should we trust them? These are questions that we’ll address in the next section.


Example 594. Let X be the random variable that is the height (in cm) of an adult female Singaporean. Our parameters-of-interest are the true population mean µ and true population variance σ² of X. We wish to generate estimates for µ and σ².

To this end, we get a random sample of size 8: (X1, X2, . . . , X8). The corresponding sample mean X̄ = (X1 + X2 + ⋅ ⋅ ⋅ + X8)/8 will serve as our estimator for µ. And the corresponding sample variance

$$S^2 = \frac{\sum_{i=1}^{8} (X_i-\bar{X})^2}{8-1}$$

will serve as our estimator for σ².

(a) Suppose our observed random sample is such that

$$\sum_{i=1}^{8} x_i = 1320 \quad\text{and}\quad \sum_{i=1}^{8} x_i^2 = 218360.$$

Then the observed sample mean x̄ and the observed sample variance s² are

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{1320}{8} = 165, \qquad s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1} = \frac{218360 - \frac{1320^2}{8}}{7} = 80.$$

And our estimates for µ and σ² are, respectively, 165 cm and 80 cm².

(b) Suppose instead our observed random sample is such that

$$\sum_{i=1}^{8} (x_i - 160) = 72 \quad\text{and}\quad \sum_{i=1}^{8} (x_i - 160)^2 = 1560.$$

Then the observed sample mean x̄ and the observed sample variance s² are

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{\sum_{i=1}^{n} (x_i - 160 + 160)}{n} = \frac{\sum_{i=1}^{n} (x_i - 160)}{n} + 160 = \frac{72}{8} + 160 = 169,$$

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - 160)^2 - \frac{\left[\sum_{i=1}^{n} (x_i - 160)\right]^2}{n}}{n-1} = \frac{1560 - \frac{72^2}{8}}{7} \approx 130.3.$$

And our estimates for µ and σ² are, respectively, 169 cm and 130.3 cm².
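Both parts of this example use the same recipe, which can be written as one small function. This is a sketch with a hypothetical helper name, `estimates_from_sums`: it takes the two summary sums and the shift a (a = 0 in part (a), a = 160 in part (b)).

```python
def estimates_from_sums(n, shifted_sum, shifted_sum_sq, a=0.0):
    """Observed sample mean and variance from sum(x - a) and sum((x - a)**2)."""
    xbar = shifted_sum / n + a
    s2 = (shifted_sum_sq - shifted_sum ** 2 / n) / (n - 1)
    return xbar, s2

print(estimates_from_sums(8, 1320, 218360))     # part (a), with a = 0
print(estimates_from_sums(8, 72, 1560, a=160))  # part (b), with a = 160
```

The shift a affects only the mean. The variance formula is the same for every a, which is the content of Fact 81(b).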


Exercise 256. Calculate the observed sample mean and variance for the following observed random sample of size 7: (3, 14, 2, 8, 8, 6, 0). (Answer on p. 1110.)

Exercise 257. (Answer on p. 1110.) Let X be the random variable that is the weight (in kg) of an American. Suppose we are interested in estimating the true population mean µ and variance σ² of X. We get an observed random sample of size 10: (x1, x2, . . . , x10).

(a) Suppose you are told that

$$\sum_{i=1}^{10} x_i = 1885 \quad\text{and}\quad \sum_{i=1}^{10} x_i^2 = 378265.$$

Find the observed sample mean x̄ and observed sample variance s².

(b) Suppose you are instead told that

$$\sum_{i=1}^{10} (x_i - 50) = 1885 \quad\text{and}\quad \sum_{i=1}^{10} (x_i - 50)^2 = 378265.$$

Find the observed sample mean x̄ and observed sample variance s².

71.7 Sample Mean and Sample Variance are Unbiased Estimators

Earlier we asked: How do we decide if an estimator and the estimates it generates are “good”? How do we know whether to trust any given estimate?

For H2 Maths, we’ll learn only about one (important) criterion for deciding whether an estimator is “good”. This is unbiasedness. Informally, an estimator is unbiased if on average, the estimator “gets it right”. Formally:

Definition 135. Let X be a random variable and θ ∈ R be a parameter (i.e. just some real number). We say that X is an unbiased estimator for θ if E[X] = θ.

If x is an estimate generated by an unbiased estimator X, then we call x an unbiased estimate.

The next proposition says that the sample mean X̄ is an unbiased estimator for the population mean µ; and the sample variance S² is an unbiased estimator for the population variance σ².

Proposition 15. Let (X1, X2, . . . , Xn) be a random sample of size n drawn from a distribution with population mean µ and population variance σ². Let X̄ be the sample mean and S² be the sample variance. Then

(a) E[X̄] = µ; and
(b) E[S²] = σ².

Proof. You are asked to prove (a) in Exercise 259. For the proof of (b), see p. 926 in the Appendices (optional).

Proposition 15(b) is the reason why, strangely enough, we define the sample variance with n − 1 in the denominator:

$$S^2 = \frac{(X_1-\bar{X})^2 + (X_2-\bar{X})^2 + \dots + (X_n-\bar{X})^2}{n-1}.$$

As defined, S² is an unbiased estimator for the population variance σ². This, then, is the reason why we define it like this. Some writers call S² the unbiased sample variance, but we shall not bother doing so. We’ll simply call S² the sample variance.

Example 591 (continued from above). (Chee Soon Juan election.) Suppose two observed random samples of size 3 are (x1, x2, x3) = (1, 0, 0) and (x1, x2, x3) = (1, 0, 1). The corresponding observed sample means are x̄1 = 1/3 and x̄2 = 2/3. These are two possible estimates (“guesses”) of the true population proportion µ.

Unless we’re extremely lucky, it’s unlikely that either of these two estimates is exactly correct. Nonetheless, what the above unbiasedness proposition tells us is this:

Suppose the unknown population mean is µ = 0.39. We draw the following 10 observed random samples of size 3 (table below). For each sample i, we calculate the corresponding observed sample mean x̄i.

Sample i | x1 | x2 | x3 | x̄i
1 | 1 | 0 | 1 | 2/3
2 | 0 | 0 | 0 | 0
3 | 0 | 1 | 0 | 1/3
4 | 1 | 0 | 0 | 1/3
5 | 0 | 1 | 1 | 2/3
6 | 1 | 0 | 0 | 1/3
7 | 0 | 0 | 0 | 0
8 | 0 | 0 | 0 | 0
9 | 0 | 0 | 1 | 1/3
10 | 1 | 1 | 0 | 2/3

Note that every estimate x̄i is wrong. Indeed, since the sample mean X̄i can only take on values 0, 1/3, 2/3, or 1, the estimates can never possibly be equal to the true µ = 0.39. Nonetheless, what the above proposition says informally is that on average, the estimate gets it correct. Formally, E[X̄] = µ = 0.39.

For a demonstration that you can play around with, try this Google spreadsheet.
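Because a sample of size 3 from a population of 0s and 1s has only 8 possible outcomes, Proposition 15 can also be checked exactly for this example. The sketch below assumes µ = 0.39 as above and uses σ² = µ − µ² (valid for a 0/1 population). The variable names are illustrative. It also computes what happens if the sum of squares is divided by n instead of n − 1.

```python
from fractions import Fraction
from itertools import product

mu = Fraction(39, 100)           # the example's assumed population proportion
sigma2 = mu * (1 - mu)           # population variance of a 0/1 population
n = 3

e_xbar = e_s2 = e_div_n = Fraction(0)
for sample in product((0, 1), repeat=n):
    # Probability of this observed sample under independent draws.
    prob = Fraction(1)
    for x in sample:
        prob *= mu if x == 1 else 1 - mu
    xbar = Fraction(sum(sample), n)
    ss = sum((x - xbar) ** 2 for x in sample)
    e_xbar += prob * xbar
    e_s2 += prob * ss / (n - 1)   # sample variance S^2
    e_div_n += prob * ss / n      # same sum of squares divided by n instead

print(e_xbar == mu, e_s2 == sigma2, e_div_n == sigma2 * (n - 1) / n)
```

The last comparison shows that dividing by n would, on average, give only (n − 1)/n of σ², which is why the denominator n − 1 is used.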


Exercise 258. (Answer on p. 1110.) We are interested in the weight (in kg) of Singaporeans. We have an observed random sample of size 5: (32, 88, 67, 75, 56).

(a) Find unbiased estimates for the population mean µ and variance σ² of the weights of Singaporeans. (State any assumptions you make.)
(b) What is the average weight of a Singaporean?

Exercise 259. Prove that E[X̄] = µ. (This is part (a) of Proposition 15.) (Answer on p. 1111.)

Exercise 260. Suppose we flip a coin 10 times. The first 7 flips are heads and the next 3 are tails. Let 1 denote heads and 0 denote tails. (Answer on p. 1111.)

(a) Write down, in formal notation, our observed random sample, the observed sample mean, and observed sample variance.
(b) Are these observed sample mean and variance unbiased estimates for the true population mean and variance?
(c) Can we conclude that this is a biased coin (i.e. the true population mean is not 0.5)?

71.8 The Sample Mean is a Random Variable

This section is just to repeat, stress, and emphasise that the sample mean X̄ is itself a random variable. This is an important point.

Indeed, the sample mean X̄ is both (i) a random variable; and (ii) an estimator. In contrast, an observed sample mean x̄ is both (i) a real number; and (ii) an estimate.

We’ve shown that E[X̄] = µ. This equation can be interpreted in two equivalent ways:

• The expected value of the sample mean equals the population mean µ.
• The sample mean is an unbiased estimator for the population mean µ.

We now give the variance of the sample mean. It turns out to be equal to the population variance σ², divided by the sample size n.

Fact 82. V[X̄] = σ²/n.

Proof. You are asked to prove this fact in Exercise 261 .
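Before attempting the proof, it may help to see Fact 82 hold exactly in a small case. The sketch below is a check, not a proof: it takes the population P = (3, 4, 7, 7, 2, 3) from Example 592 and enumerates every equally likely observed random sample of size n = 2 drawn with replacement (the choice of n = 2 is arbitrary).

```python
from fractions import Fraction
from itertools import product

population = (3, 4, 7, 7, 2, 3)   # the small population from Example 592
k, n = len(population), 2

mu = Fraction(sum(population), k)
sigma2 = sum((v - mu) ** 2 for v in population) / k

# Sampling with replacement: each ordered pair of positions is equally likely.
xbars = [Fraction(sum(s), n) for s in product(population, repeat=n)]
mean_of_xbar = sum(xbars) / len(xbars)
var_of_xbar = sum((x - mean_of_xbar) ** 2 for x in xbars) / len(xbars)

print(mean_of_xbar == mu, var_of_xbar == sigma2 / n)
```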

Exercise 261. Prove Fact 82. (Hint: Note that X̄ = (1/n)(X1 + X2 + ⋅ ⋅ ⋅ + Xn) and X1, X2, . . . , Xn are independent.) (Answer on p. 1111.)

Exercise 262. For each of the following terms, give a formal definition and an intuitive explanation. (State whether each term is a random variable or a real number.) For simplicity, you may assume that the finite population is given by P = (x1, x2, . . . , xk). (Answer on p. 1112.)

(a) The population mean.
(b) The population variance.
(c) The sample mean.
(d) The sample variance.
(e) The mean of the sample mean.
(f) The variance of the sample mean.
(g) The mean of the sample variance.
(h) The observed sample mean.
(i) The observed sample variance.

71.9 The Distribution of the Sample Mean

Fact 83. Let X1, X2, . . . , Xn ∼ N(µ, σ²) be independent random variables. Then

$$\bar{X}_n = \frac{X_1 + X_2 + \dots + X_n}{n} \sim N\left(\mu, \frac{\sigma^2}{n}\right).$$

Proof. Corollary 7 tells us that the sum of normal random variables is itself a normal random variable. So X1 + X2 + ⋅ ⋅ ⋅ + Xn is a normal random variable.

Fact 79 tells us that a linear transformation of a normal random variable is itself a normal random variable. So X̄n = (X1 + X2 + ⋅ ⋅ ⋅ + Xn)/n is a normal random variable.

In the previous sections, we already showed that X̄n has mean µ and variance σ²/n.

Altogether then, X̄n ∼ N(µ, σ²/n).

Now, suppose instead X1, X2, . . . , Xn are not normally-distributed. Surprisingly, a similar result still holds, thanks to the CLT. Informally, draw X1, X2, . . . , Xn from any distribution. Then thanks to the CLT, it will still be the case that, provided n is “large enough”, X̄n is (approximately) normally-distributed. Formally:

Fact 84. Let X1, X2, . . . , Xn be independent random variables, each identically-distributed with mean µ ∈ R and variance σ² ∈ R. Let

$$\bar{X}_n = \frac{X_1 + X_2 + \dots + X_n}{n}.$$

Then

$$\lim_{n\to\infty} \bar{X}_n \sim N\left(\mu, \frac{\sigma^2}{n}\right).$$

Proof. The CLT says that if n is “large enough”, then X1 + X2 + ⋅ ⋅ ⋅ + Xn is well-approximated by the normal distribution N(nµ, nσ²).

And so it follows from Fact 79 (a linear transformation of a normal random variable is itself a normal random variable) that X̄n = (X1 + X2 + ⋅ ⋅ ⋅ + Xn)/n is well-approximated by the normal distribution N(µ, σ²/n).
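A small simulation can make Fact 84 more concrete. The sketch below rests on assumptions that are not from the text: each Xi is uniform on [0, 1) (so µ = 1/2 and σ² = 1/12), n = 30 is taken to be “large enough”, and the seed and number of repetitions are arbitrary. It checks that the simulated sample means have roughly the mean, variance, and 95% range that N(µ, σ²/n) would predict.

```python
import random

rng = random.Random(0)     # fixed seed, only so that runs are reproducible
n, reps = 30, 10_000       # sample size and number of simulated samples

# Each X_i is uniform on [0, 1): mean 1/2, variance 1/12 (not a normal distribution).
mu, sigma2 = 0.5, 1 / 12
xbars = [sum(rng.random() for _ in range(n)) / n for _ in range(reps)]

mean_xbar = sum(xbars) / reps
var_xbar = sum((x - mean_xbar) ** 2 for x in xbars) / reps
sd = (sigma2 / n) ** 0.5
within = sum(abs(x - mu) <= 1.96 * sd for x in xbars) / reps

print(mean_xbar)   # should be close to mu = 0.5
print(var_xbar)    # should be close to sigma2 / n
print(within)      # a normal distribution would give about 0.95
```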

In the next chapter, we’ll make greater use of the two results given in this section.

71.10 Non-Random Samples

Some examples to illustrate the concept of a non-random sample:

Example 595. Suppose we’re interested in the average height of a Singaporean. The only way to know this for sure is to survey every single Singaporean. This, however, is not practical. Instead, we have only the resources to survey 100 individuals. We decide to go to a basketball court and measure the heights of 100 people there. We thereby gather an observed sample of size 100: (x1, x2, . . . , x100). We find that the average individual’s height is x̄ = ∑ xi/100 = 179 cm.

Is x̄ = 179 cm an unbiased estimate of the average Singaporean’s height? Intuitively, we know that the answer is obviously no.

The reason is that our observed sample of size 100 was non-random. We picked a basketball court, where the individuals are overwhelmingly (i) male; and (ii) taller than average. Our estimate x̄ = 179 cm is thus probably biased upwards.

Example 596. Suppose we’re interested in what the average Singaporean family spends on food each month. The only way to know this for sure is to survey every single family in Singapore. This, however, is not practical. Instead, we have only the resources to survey 100 families. We decide to go to Sixth Avenue and randomly ask 100 families living there what they reckon they spend on food each month. We thereby gather an observed sample of size 100: (x1, x2, . . . , x100). We find that the average family spends x̄ = ∑ xi/100 = $2,700 on food each month.

Is x̄ = $2,700 an unbiased estimate of the average monthly spending on food by a Singaporean family? Intuitively, we know that the answer is obviously no. The reason is that our observed sample of size 100 was non-random. We picked an unusually affluent neighbourhood. Our estimate x̄ = $2,700 is thus probably biased upwards.

71.11 Stratified, Quota, and Systematic Sampling

SYLLABUS ALERT

This is in the 9740 (old) syllabus, but not in the 9758 (revised) syllabus. So you can skip this section if you’re taking 9758.

Here’s an example of stratified sampling:

Example 597. We’re interested in the average height µ of a Singaporean. We decide to use a random sample of 100 individuals. The sample mean height X̄ will be our estimator for µ.

We could, as usual, simply randomly pick 100 individuals from the population. (This usual method is called simple random sampling, in distinction to the three new forms of sampling introduced in this section.)

The problem with this is that we might be unlucky and get disproportionately many males or females. For example, we might get an observed random sample of 60 males and 40 females. Since males are on average taller than females, our observed sample mean height x̄ would probably be an overestimate of the true µ.

To reduce such bad luck, we might first divide the population into 2 strata: Male and Female. Assume that 0.49 of the population is male and 0.51 of the population is female. Then within the Male stratum, we randomly pick 49 individuals; and within the Female stratum, we randomly pick 51 individuals. This is called stratified sampling. The sex ratio of our random sample is thus guaranteed (by design) to closely match that of the population.

Informally, the advantage of stratified sampling is that it gives us a more representative sample. Mathematically, this means that the estimates generated will tend to be closer to the true parameter.⁸¹

⁸¹ More formally, stratified sampling reduces the variance of the estimator.
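The claim that stratified sampling reduces the variance of the estimator can be illustrated by simulation. Everything numerical in this sketch is invented for illustration: the population of 10,000 heights, the stratum means of 172 cm and 160 cm, the common standard deviation of 6 cm, and the seed. For simplicity, individuals are drawn with replacement within each stratum.

```python
import random

rng = random.Random(1)  # fixed seed, only for reproducibility

# Hypothetical population of heights in cm (all numbers invented for illustration):
# 49% male and 51% female, with males taller on average.
males = [rng.gauss(172, 6) for _ in range(4900)]
females = [rng.gauss(160, 6) for _ in range(5100)]
everyone = males + females

def simple_random_mean():
    return sum(rng.choices(everyone, k=100)) / 100

def stratified_mean():
    sample = rng.choices(males, k=49) + rng.choices(females, k=51)
    return sum(sample) / 100

def variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

reps = 2000
var_simple = variance([simple_random_mean() for _ in range(reps)])
var_stratified = variance([stratified_mean() for _ in range(reps)])
print(var_simple, var_stratified)  # the stratified estimator should vary less
```

With these made-up numbers, roughly half of the spread in heights comes from the difference between the two strata, so fixing the sex ratio removes about half of the estimator's variance.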


Quota sampling is very similar to stratified sampling. Again, we divide the population into strata. The difference is that instead of randomly picking respondents from each stratum, interviewers are free to choose the respondents from each stratum.

Example 598. Again, we want to know the average height µ of a Singaporean. We decide to use a random sample of 100 individuals. The sample mean height X̄ will be our estimator for µ.

Again, we divide the population into 2 strata: Male and Female. Assume that 0.49 of the population is male and 0.51 of the population is female. In stratified sampling, we would have randomly picked 49 males and 51 females. In contrast, in quota sampling, the interviewer is given the freedom to choose any 49 males and any 51 females, in whatever way the interviewer deems appropriate.

A small advantage of quota sampling is that it gives the interviewer more flexibility and can thus speed up the collection of data. The big disadvantage with quota sampling is that the interviewer may unwittingly introduce biases. For example, told simply to choose 49 males and 51 females, the interviewer might choose respondents who are more attractive, more friendly-looking, of the same race, etc.

Here’s a historical example where quota sampling led to disaster:

Example 599. The 1948 US Presidential Election featured Harry Truman (Democrat) vs Thomas Dewey (Republican). In the months leading to the election, polls were almost unanimous that there’d be a landslide victory for Dewey. They were wrong.

The following is taken from p. 19 of a 1949 report analysing the 1948 polling disaster. Every poll gave Dewey a sizeable lead. Every poll was wrong.

Poll | Dewey | Truman | T − D
Gallup Poll | 49.5% | 44.5% | −5.0%
Crossley Poll | 49.9% | 44.8% | −5.1%
Roper Poll | 52.2% | 37.1% | −15.1%
Actual | 45.1% | 49.5% | +4.4%

This polling disaster was immortalised by the following photograph of Truman holding up a copy of the Chicago Daily Tribune. On the morning following the election, the Tribune had decided to run with the front-page headline “DEWEY DEFEATS TRUMAN”, even before knowing the official result.


What explains this polling disaster? One explanation (among many; see cited report) was that the pollsters used quota sampling.

“Interviewers were each given an assigned quota specifying the number of men, women, and persons of various economic levels to be interviewed.” It was believed this would reduce “the biases that might result if interviewers chose their respondents freely” (p. 12, id.).

However, other than being subject to these quotas, interviewers had complete freedom to choose their respondents. They could thus operate as “biased” selecting devices (p. 84).

For example, an interviewer who was himself likely to vote Dewey might tend (unconsciously or otherwise) to choose respondents who were themselves also likely to vote Dewey (p. 136ff).

As another example, the interviewers were themselves likely to be upper-class whites (a demographic that tended to vote Dewey). Thus, lower-class and Negro respondents (demographics that tended to vote Truman) may have been less likely to tell the truth when responding to their interviewers (p. 135).

Polling has improved since 1948. For example, pollsters have now moved away from quota sampling. Nonetheless, polling disasters still occur occasionally. Two recent examples:

Example 600. The Brexit referendum of June 23, 2016 gave British voters two choices: “Remain” (in the EU) or “Leave”. In the week before the actual referendum, there were 12 polls, of which 9 gave “Remain” the lead, including one that gave “Remain” a commanding 55 to 45 lead! And the only 3 polls that had “Leave” in the lead gave “Leave” only a 1 or 2% lead. (Source: Wikipedia.)

Alas, these polls got it terribly wrong. On the day of the referendum itself, the “Leave” campaign won 52 to 48. (Note though that the pollsters, having learnt the lesson of 1948, probably didn’t use quota sampling. So the use of quota sampling was probably not one of the explanations of why the Brexit polling got it wrong.)

To be fair, pollsters were not the only ones who got it wrong. Even the financial markets had expected “Remain” to win. This was dramatically illustrated by the plunge in the value of the British pound on the night of the referendum, as the results came in:

[Figure: US Dollar (USD) per British Pound (GBP).]


Example 601. The 2016 US Presidential Elections on Nov 8, 2016 gave American voters two main choices: Hillary Clinton or Donald Trump. Almost all polls predicted a win for Hillary Clinton. Pivotal states, like Michigan, Pennsylvania, and Wisconsin were judged by many pollsters to be safe wins for Hillary. They were wrong. On the day before the election, British betting markets were still giving Donald Trump only about a 15% - 20% probability of winning. They were also wrong. Donald Trump’s victory was highly surprising. This surprise result was reflected in the plunge of the Mexican peso as results came in on the night of the election. At 8pm (New York time), 1 Mexican peso was still worth 5.477 US cents. Three hours later, at 11pm, the peso had fallen by 11% — 1 peso was now worth only 4.870 cents. Newsweek magazine printed two versions of its magazine. However, one of its distributors (a company by the name of Topix Media) was so confident that Hillary would win that it dispatched 125,000 copies of the Hillary version. These had to be recalled, to the embarrassment of Newsweek. Below is a photo of Hillary signing a copy of the version with “Madam President” on the cover.

This example was added in November 2016, a few days after the election. Professional pollsters are still scrambling to figure out why exactly they got it so wrong. One plausible reason is that “shy” Trump-supporters were reluctant to tell pollsters that they’d be voting for a misogynist, racist, and xenophobic rabble-rouser.


Here’s an example of systematic sampling:

Example 602. We want to know the average weight µ of the 28,311 undergraduate students enrolled at NUS. We decide to use a random sample of 100 individuals. The sample mean weight X̄ will be our estimator for µ.

We have a complete list of all the students. We order them alphabetically from student #1 to student #28,311. We then get every 250th student on the list, until we have 100 individuals. That is, we get student #250, student #500, student #750, . . . , and finally student #25,000. This is called systematic sampling.

The advantage of systematic sampling is that it takes away the freedom of interviewers to choose their respondents. The sample is thus more likely to be truly random.

The disadvantage of systematic sampling is that it may, by coincidence, introduce a large and systematic bias. Here’s an example.

Example 603. We are interested in the average household spending on groceries in a particular HDB estate. We decide to use systematic sampling: We’ll survey every flat whose unit number ends in 7. This, we believe, is surely random.

This seems like a great idea. But unfortunately, it turns out that in this particular HDB estate, units whose numbers end in 7 are all 5-room flats. And every other unit is a 4-room flat. We are completely unaware of this. As a result, we’ll probably obtain an overestimate of the true average household spending in this HDB estate.
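The selection rule in Example 602 is simple enough to write down directly. The sketch below (variable names are illustrative) lists the student numbers that would be chosen:

```python
N = 28311        # number of students on the alphabetical list
step = 250       # take every 250th student
sample_size = 100

# Student numbers are 1-based: #250, #500, ..., #25000.
selected = list(range(step, step * sample_size + 1, step))

print(selected[:3], selected[-1], len(selected))
```

Note that students numbered above 25,000 can never be selected under this rule.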


72

Null Hypothesis Significance Testing (NHST)

Here’s a quick sketch of how Null Hypothesis Significance Testing (NHST) works:

Example 604. A piece of equipment has probability θ of breaking down. We have many pieces of the same type of equipment. Assume the rates of breakdown across the pieces of equipment are identical and independent.

1. Write down a null hypothesis H0. In this case, it might be “H0 : θ = 0.6”.

2. Write down an alternative hypothesis HA. In this case, it might be “HA : θ < 0.6”. (This is a one-tailed test, to be explained shortly.)

3. Observe a random sample. For example, we might have an observed random sample of size 5, where only the fourth piece of equipment breaks down. And so we’d write (x1, x2, x3, x4, x5) = (0, 0, 0, 1, 0).

4. Write down a test statistic. In this case, an obvious test statistic is the sample number of failures T = X1 + X2 + X3 + X4 + X5. Our observed test statistic is thus t = x1 + x2 + x3 + x4 + x5 = 0 + 0 + 0 + 1 + 0 = 1.

5. Now ask, how likely is it that — if H0 were true — our test statistic would have been “at least as extreme as” that actually observed? That is, what is the probability P (Observe data as extreme as that observed∣H0 )?

The above probability is called the p-value of the observed sample.

In this case, the p-value is the probability of observing a random sample where 1 or fewer pieces of equipment broke down, assuming H0 ∶ θ = 0.6 were true. That is, p = P (T ≤ t = 1∣H0 ) .

Now, remember that $T$ is a random variable. In fact, it’s a binomial random variable. Assuming $H_0$ to be true, we have $T \sim B(n, \theta) = B(5, 0.6)$. Thus,

$$p = P(T \le 1 \mid H_0) = P(T = 0 \mid H_0) + P(T = 1 \mid H_0) = \binom{5}{0} 0.6^0 \, 0.4^5 + \binom{5}{1} 0.6^1 \, 0.4^4 = 0.08704.$$

This says that if $H_0$ were true, then the probability of observing a test statistic as extreme as the one we actually observed is only 0.08704. We might interpret this relatively small p-value as casting doubt on or providing evidence against $H_0$.
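If you’d like to check the arithmetic by computer rather than by calculator, here is a minimal Python sketch of the p-value calculation in Example 604. The helper names `binom_pmf` and `lower_tail_p_value` are my own; only `math.comb` comes from the standard library.

```python
from math import comb

def binom_pmf(k, n, theta):
    """P(T = k) when T ~ B(n, theta)."""
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

def lower_tail_p_value(t, n, theta):
    """One-tailed p-value P(T <= t | H0), used when HA says theta is smaller."""
    return sum(binom_pmf(k, n, theta) for k in range(t + 1))

observed = (0, 0, 0, 1, 0)      # observed random sample from Example 604
t = sum(observed)               # observed test statistic, t = 1
p_value = lower_tail_p_value(t, n=len(observed), theta=0.6)
print(round(p_value, 5))
```

The two terms being added are exactly P(T = 0 | H0) and P(T = 1 | H0) from the calculation above.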


Here is the full list of the ingredients that go into NHST.

Null Hypothesis Significance Testing (NHST)

1. Null hypothesis H0 (e.g. “this equipment has probability 0.6 of breaking down”).

2. Alternative hypothesis HA (e.g. “this equipment has probability less than 0.6 of breaking down”). The test is either one-tailed or two-tailed, depending on HA.

3. A random sample of size n: (X1, X2, . . . , Xn).

4. A test statistic T (which simply maps each observed random sample to a real number).

5. The p-value of the observed sample. This is the probability that, assuming H0 were true, T takes on values that are at least “as extreme as” the actual observed test statistic t.

6. The significance level α. This is a pre-selected threshold, usually chosen to be some small value. The conventional significance levels are α = 0.1, α = 0.05, or α = 0.01.

We then conclude qualitatively that:

• A small p-value casts doubt on or provides evidence against H0.
• A large p-value fails to cast doubt on or provide evidence against H0.

In particular, if p < α, then we say that we reject H0 at the significance level α. And if p ≥ α, then we say that we fail to reject H0 at the significance level α.

Note importantly that to reject H0 (at some significance level α) does NOT mean that H0 is false and HA is true. Similarly, failure to reject H0 does NOT mean that H0 is true and HA is false. More on this below.

Another example of NHST, now slightly more formally and carefully presented.


Example 672. (Dr. Chee election example.) Our parameter of interest is µ, the proportion of votes for Dr. Chee. We guess that Dr. Chee won only 30% of the votes. We might thus write down two competing hypotheses: H0 ∶ µ = 0.3, HA ∶ µ > 0.3.

We call H0 the null hypothesis and HA the alternative hypothesis. We pre-select α = 0.05 as our significance level. This is the arbitrary threshold at which we’ll say we reject (or fail to reject) H0 .

We gather a random sample of 100 votes: (X1 , X2 , . . . , X100 ). Our test statistic is the number of votes in favour of Dr. Chee, given by T = X1 + X2 + ⋅ ⋅ ⋅ + X100 .

Suppose that in our observed random sample (x1 , x2 , . . . , x100 ), we find that 39 are in favour of Dr. Chee. Our observed test statistic is thus t = 39.

We now ask: What is the probability that, assuming $H_0$ were true, $T$ takes on values that are at least “as extreme as” the actual observed test statistic $t$? That is, what is the p-value of the observed sample?

Now, assuming $H_0$ were true, $T$ is a binomial random variable with parameters 100 and 0.3. That is, $T \sim B(n, \mu) = B(100, 0.3)$. So:

$$\begin{aligned}
p &= P(T \ge 39 \mid H_0) = P(T = 39 \mid H_0) + P(T = 40 \mid H_0) + \dots + P(T = 100 \mid H_0) \\
&= \binom{100}{39} 0.3^{39} \, 0.7^{61} + \binom{100}{40} 0.3^{40} \, 0.7^{60} + \dots + \binom{100}{100} 0.3^{100} \, 0.7^{0} \approx 0.03398.
\end{aligned}$$

The small p-value casts doubt on or provides evidence against H0 .

And since p ≈ 0.03398 < α = 0.05, we can also say that we reject H0 at the α = 0.05 significance level.
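The sum of 62 binomial terms above is tedious by hand. As a rough check, here is a short Python sketch that adds them up and applies the decision rule. The function name `upper_tail_p_value` and the variable names are illustrative choices of mine, not anything prescribed by the syllabus.

```python
from math import comb

def upper_tail_p_value(t, n, mu):
    """One-tailed p-value P(T >= t | H0) when T ~ B(n, mu) under H0."""
    return sum(comb(n, k) * mu**k * (1 - mu)**(n - k) for k in range(t, n + 1))

ALPHA = 0.05                                   # pre-selected significance level
p_value = upper_tail_p_value(39, n=100, mu=0.3)
decision = "reject H0" if p_value < ALPHA else "fail to reject H0"
print(f"p = {p_value:.5f}: we {decision} at the {ALPHA} significance level")
```

Remember that the printed decision says only whether p < α. It says nothing about whether H0 is “true” or “false”.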


Let θ be the parameter we’re interested in. Under the objectivist interpretation, the value of θ may be unknown, but it is fixed. This has two consequences:

1. We never speak probabilistically about θ, because θ is a fixed number. For example, we never say “θ is probably less than 0.6” or “θ has probability 0.8 of being between 0.4 and 0.7”. Such statements are nonsensical.

2. The null hypothesis, which is always written as an equality (e.g. “H0 ∶ θ = 0.6”), is almost certainly false. After all, θ can (usually) take on a continuum of values. So do NOT interpret “we fail to reject H0” to mean “H0 is true”. This is because H0 is almost certainly false.

When performing NHST, we will assiduously avoid saying things like “H0 is true”, “H0 is false”, “HA is true”, or “HA is false”. Instead, we will stick strictly to saying either “we reject H0 at the significance level α” or “we fail to reject H0 at the significance level α”. Each of these two statements has a very precise meaning. The first says that p < α. The second says that p ≥ α. Nothing more and nothing less.

Exercise 263. We flip a coin 20 times and get 17 heads. Test, at the 5% significance level, whether the coin is biased towards heads. (Answer on p. 1113.)


72.1

One-Tailed vs Two-Tailed Tests

In the previous section, all the NHST we did were one-tailed tests.82 For example, in the NHST done for Dr. Chee, we had H0 ∶ µ = 0.3, HA ∶ µ > 0.3.

This was a one-tailed test because the alternative hypothesis HA was that µ was to the right of 0.3. If instead we changed the alternative hypothesis to: H0 ∶ µ = 0.3, HA ∶ µ ≠ 0.3.

Then this would be called a two-tailed test, because the alternative hypothesis HA is that µ is either to the left or to the right of 0.3. We now repeat the examples done in the previous section, but with HA tweaked so that we instead have two-tailed tests. The difference is that the p-value is calculated differently.

82

By the way, the more common convention is to say “one-tailed” and “two-tailed” tests, rather than “one-tail” and “two-tail” tests, as is the norm in Singapore (similar to those “Close for break” signs you sometimes see). But after some consultation with my grammatical experts, I have been told that both are equally correct.


Example 604 (equipment breakdown). Everything is as before, except that we now change the alternative hypothesis: H0 ∶ θ = 0.6, HA ∶ θ ≠ 0.6.

Say we observe the same random sample as before: (x1 , x2 , x3 , x4 , x5 ) = (0, 0, 0, 1, 0).

Again our test statistic is the sample number of failures T = X1 + X2 + X3 + X4 + X5 . And so again our observed test statistic is t = x1 + x2 + x3 + x4 + x5 = 0 + 0 + 0 + 1 + 0 = 1. The difference now is how the p-value (of the observed sample) is calculated. In words, the p-value gives the likelihood that our test statistic is “at least as extreme as” that actually observed — assuming H0 were true.

Previously, under a one-tailed test, we interpreted “our test statistic is at least as extreme as that actually observed” to mean the event T ≤ t = 1.

Now that we’re doing a two-tailed test, we’ll instead interpret the same phrase to mean both the event $T \le t = 1$ and the event that $T$ is at least as far away on the other side of $E[T \mid H_0] = 3$. The second event is, specifically, $T \ge 5$. Altogether then, the p-value is given by

$$\begin{aligned}
p &= P(T \le 1 \text{ or } T \ge 5 \mid H_0) \\
&= P(T = 0 \mid H_0) + P(T = 1 \mid H_0) + P(T = 5 \mid H_0) \\
&= \binom{5}{0} 0.6^0 \, 0.4^5 + \binom{5}{1} 0.6^1 \, 0.4^4 + \binom{5}{5} 0.6^5 \, 0.4^0 = 0.1648.
\end{aligned}$$

Since p = 0.1648 ≥ α = 0.1, we say that we fail to reject H0 at the α = 0.1 significance level.

Observe that previously, under the one-tailed test, we could reject H0 at the α = 0.1 significance level, because there p = 0.08704. Now, in contrast, under the two-tailed test, we fail to reject H0 at the same significance level. In general, all else equal, the p-value for an observed random sample is greater under a two-tailed test than under a one-tailed test. Thus, under a two-tailed test, we are less likely to reject H0 .


Example 672 (Dr. Chee election). We change the alternative hypothesis: H0 ∶µ = 0.3, HA ∶µ ≠ 0.3.

Say we observe the same random sample as before: (x1 , x2 , . . . , x100 ), in which 39 votes were in favour of Dr. Chee. So again our observed test statistic is t = x1 + x2 + ⋅ ⋅ ⋅ + x100 = 39. The difference now is how the p-value (of the observed sample) is calculated. In words, the p-value gives the likelihood that our test statistic is “at least as extreme as” that actually observed — assuming H0 were true.

Previously, under a one-tailed test, we interpreted “our test statistic is at least as extreme as that actually observed” to mean the event T ≥ t = 39.

Now that we’re doing a two-tailed test, we’ll instead interpret the same phrase to mean both the event $T \ge t = 39$ and the event that $T$ is at least as far away on the other side of $E[T \mid H_0] = 30$. The second event is, specifically, $T \le 21$. Altogether then, the p-value is given by

$$\begin{aligned}
p &= P(T \le 21 \text{ or } T \ge 39 \mid H_0) = 1 - P(22 \le T \le 38 \mid H_0) \\
&= 1 - \left[ P(T = 22 \mid H_0) + P(T = 23 \mid H_0) + \dots + P(T = 38 \mid H_0) \right] \\
&= 1 - \left[ \binom{100}{22} 0.3^{22} \, 0.7^{78} + \binom{100}{23} 0.3^{23} \, 0.7^{77} + \dots + \binom{100}{38} 0.3^{38} \, 0.7^{62} \right] \approx 0.06281.
\end{aligned}$$

Since p = 0.06281 ≥ α = 0.05, we say that we fail to reject H0 at the α = 0.05 significance level.

Again observe that previously, under the one-tailed test, we could reject H0 at the α = 0.05 significance level, because there p = 0.03398. Now, in contrast, under the two-tailed test, we fail to reject H0 at the same significance level.
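Both two-tailed calculations in this section follow the same recipe: count every outcome that is at least as far from E[T | H0] as the observed t, on either side. Here is a minimal Python sketch of that recipe. The function names are my own, and note that this “mirror-image” rule is simply the one used in the two examples above; it is not the only way a two-tailed p-value can be defined.

```python
from math import comb

def binom_pmf(k, n, theta):
    """P(T = k) when T ~ B(n, theta)."""
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

def two_tailed_p_value(t, n, theta):
    """Two-tailed p-value under the mirror-image rule of the examples above.

    An outcome k counts as "at least as extreme" if it is at least as far
    from E[T | H0] = n * theta as the observed t, on either side.
    """
    mean = n * theta
    distance = abs(t - mean)
    # Small tolerance so floating-point noise does not drop boundary values.
    return sum(binom_pmf(k, n, theta) for k in range(n + 1)
               if abs(k - mean) >= distance - 1e-9)

p_equipment = two_tailed_p_value(1, n=5, theta=0.6)      # equipment example
p_election = two_tailed_p_value(39, n=100, theta=0.3)    # Dr. Chee example
print(round(p_equipment, 4), round(p_election, 4))
```

For the equipment example the outcomes counted are 0, 1 and 5. For the election example they are 0 to 21 and 39 to 100.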

Exercise 264. We flip a coin 20 times and get 17 heads. Test, at the 5% significance level, whether the coin is biased. (Answer on p. 1113.)


72.2

The Abuse of NHST (Optional)

NHST is popular because it gives a simplistic, formulaic cookbook procedure. Moreover, its conclusion appears to be binary: either we reject H0 or we fail to reject H0 . However, NHST is widely misunderstood, misinterpreted, and misused even within scientific communities. It has long been heavily criticised. In March 2016, the American Statistical Association even issued an official policy statement on how NHST should be used! Here I discuss only the most important, commonly-made error. We may write the p-value as p = P (D∣H0 ) ,

where D stands for the observed data and H0 stands for the null hypothesis. The p-value answers the following question: — assuming H0 were true, what’s the probability that we’d get data “at least as extreme” as those actually observed (D)? Say we get a p-value of 0.03. We should then say simply that • The small p-value casts doubt on or provides evidence against H0 .

• If the pre-selected significance level was α = 0.05, then we may say that we reject H0 at the 5% significance level. However, instead of merely saying the above, some researchers may instead conclude that: H0 is true with probability 0.03. Do you see the error here? The researcher has gone from the finding that p = P (D∣H0 ) = 0.03 to the conclusion that P (H0 ∣D) = 0.03. This is precisely the Conditional Probability Fallacy (CPF), which we discussed at length in subsection 56.1. The error is the same as leaping from “A lottery ticket buyer who doesn’t cheat has a small probability q of winning” to “Jane bought a lottery ticket and won. Therefore, there is only probability q that she didn’t cheat.” The p-value is NOT the probability that H0 is true.83 Instead, it is the probability that — assuming H0 were true — we would have gotten data “at least as extreme” as those actually observed. This is an important difference. But it is also a subtle one, which is why even researchers get confused.

83

Indeed, under the objectivist view, such a statement is nonsensical anyway, because H0 is either true or not true; it makes no sense to talk probabilistically about whether H0 is true.


72.3

Common Misinterpretations of the Margin of Error (Optional)

The sampling error or margin of error is often misinterpreted by laypersons (and journalists). Example 605. On the night of the 2016 Bukit Batok SMC By-Election, the Elections Department announced* that based on a sample count of 900 ballots, • Dr. Chee had won 39% of the votes.

• These sample counts have a confidence level of 95%, with a ±4% margin of error.

What does the above gobbledygook mean? Let $\mu$ be the true proportion of votes won by Dr. Chee. Let $\bar{X}$ be the sample proportion and $\bar{x}$ be the observed sample proportion.

It’s clear enough what the 39% means: they randomly counted 900 ballots and found (after accounting for any spoilt votes) that $\bar{x} = 39\%$ were in favour of Dr. Chee. What’s less clear is what the 95% confidence level and ±4% margin of error mean. Here are three possible interpretations of what is meant. Only one is correct.

1. “With probability 0.95, $\mu \in (\bar{x} - 0.04, \bar{x} + 0.04) = (0.35, 0.43)$.”

2. “With probability 0.95, $\bar{X} \in (\bar{x} - 0.04, \bar{x} + 0.04) = (0.35, 0.43)$.”

Equivalently, suppose we repeatedly observe many random samples of size 900. Then we should find that in 0.95 of these observed random samples, the observed sample mean is between 0.35 and 0.43.

3. “With probability 0.95, $\bar{X} \in (\mu - 0.04, \mu + 0.04)$.”

We have no idea what $\mu$ is. All we can say is that with probability 0.95, the sample mean $\bar{X}$ of votes for Dr. Chee is between $\mu - 0.04$ and $\mu + 0.04$.

Equivalently, suppose we repeatedly observe many random samples of size 900. Then we should find that in 0.95 of these observed random samples, the observed sample mean is between $\mu - 0.04$ and $\mu + 0.04$.

Take a moment to understand what each of the above interpretations says. Then decide which you think is the correct interpretation, before reading on.

*Sources: The Straits Times [backup], TodayOnline [backup].


Interpretation #1, “with probability 0.95, $\mu \in (\bar{x} - 0.04, \bar{x} + 0.04) = (0.35, 0.43)$”, is perhaps the one most commonly made by laypersons.* It makes two errors:

1. It is nonsensical to speak probabilistically about the proportion $\mu$ of votes won by Dr. Chee. $\mu$ is some fixed number. So either $\mu$ is in the interval (0.35, 0.43), or it isn’t. It makes no sense to speak probabilistically about whether $\mu$ is in that interval.

2. It centres the margin of error on the observed sample proportion $\bar{x} = 0.39$. But the margin of error is applicable to the true proportion $\mu$ and not to the observed sample proportion.

Some “authorities” often attempt** to correct Interpretation #1 by offering Interpretation #2, “with probability 0.95, $\bar{X} \in (\bar{x} - 0.04, \bar{x} + 0.04) = (0.35, 0.43)$”. However, Interpretation #2 is still wrong, because it still makes the second of the above two errors.

Unfortunately, the correct interpretation is also the one that says the least. It is Interpretation #3, “with probability 0.95, $\bar{X} \in (\mu - 0.04, \mu + 0.04)$”.

This interpretation says merely that if we were somehow able to repeatedly observe random samples of size 900, then we’d find that 0.95 of the corresponding observed sample means will be in (µ − 0.04, µ + 0.04). Which isn’t saying much, because first of all, we have only one observed random sample; we do not get to repeatedly observe random samples. Secondly, this still doesn’t tell us much about µ, which is what we’re really interested in. The correct interpretation (Interpretation #3) is the least interesting interpretation. Perhaps this explains why journalists often prefer to give an incorrect interpretation.

*E.g. the article “Margin of Ignorance” (backup) begins by reporting poll results that Kerry-Edwards was supported by 51% of voters, while Bush-Cheney was supported by 45%. The author then ridicules other journalists for their misinterpretation of these data. (He also claims, incorrectly, that polling is based on the Central Limit Theorem.) He then triumphantly gives the “correct” explanation: “95 times out of 100 the true Kerry-Edwards number will fall between 47 and 55 and the Bush-Cheney number will fall between 41 and 49.” This, of course, is what we called incorrect Interpretation #1 above. **Section 3 of “Erring on the Margin of Error” lists some such mistakes.

See section 85.9 in the Appendices for a discussion of where the Elections Department’s ±4% margin of error comes from.


Journalists often try to explain what the confidence level and margin of error mean; they almost always get it wrong.

Example 606. On the night of the 2016 Bukit Batok SMC By-Election, a website called Mothership.sg wrote: “Based on the sample count of 100 votes,* it was revealed at 9.26pm that the SDP Sec-Gen received 39 percent of votes. In other words, Chee would score 35 per cent in the worst case scenario and 43 per cent in the best case scenario.”

This is the most absurd misinterpretation of the margin of error I have ever seen.** Let’s see what the correct worst- and best-case scenarios are. Suppose that in the observed random sample of 900 votes, exactly 39% or 0.39 × 900 = 351 were votes for Dr. Chee and the remaining 549 were for PAP Guy. Then:

• Worst-case scenario: The observed random sample of 900 votes happened to contain exactly all of the votes in favour of Dr. Chee. That is, Dr. Chee won only 351 votes and PAP Guy won the remaining 23,570 − 351 = 23,219 votes. So the correct worst-case scenario is that Dr. Chee won ≈ 1.5% of the votes.

• Best-case scenario: The observed random sample of 900 votes happened to contain exactly all of the votes in favour of PAP Guy. That is, PAP Guy won only 549 votes and Dr. Chee won the remaining 23,570 − 549 = 23,021 votes. So the correct best-case scenario is that Dr. Chee won ≈ 97.7% of the votes.

These worst- and best-case scenarios are admittedly unlikely. Nonetheless, they are possible scenarios all the same. The journalist’s purported worst- and best-case scenarios are completely wrong.

*By the way, even this basic fact was wrong. The sample count was not 100 votes. Instead, it was 900 votes, consisting of 100 votes from each of 9 polling stations. Moreover, the Mothership.sg journalist failed to report the confidence level of 95%, either because he didn’t know what it meant or because he didn’t think it important. But it is important. It is pointless to inform the reader about the margin of error without also specifying the confidence level.

**You can find several misinterpretations of the margin of error collected in this academic paper: “Erring in the Margin of Error”. None is as absurdly bad as the one here.
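The worst- and best-case arithmetic of Example 606 is easy to reproduce. The sketch below only restates the numbers given in the example (23,570 total votes, a sample of 900, 39% for Dr. Chee); the variable names are mine.

```python
TOTAL_VOTES = 23_570      # total votes cast, as used in Example 606
SAMPLE_SIZE = 900         # size of the sample count

sample_for_chee = round(0.39 * SAMPLE_SIZE)          # 351 sampled votes for Dr. Chee
sample_for_other = SAMPLE_SIZE - sample_for_chee     # 549 sampled votes for his opponent

# Worst case: every single vote for Dr. Chee happens to be in the sample.
worst_share = sample_for_chee / TOTAL_VOTES
# Best case: every single vote for his opponent happens to be in the sample.
best_share = (TOTAL_VOTES - sample_for_other) / TOTAL_VOTES

print(f"worst case: {worst_share:.1%}, best case: {best_share:.1%}")
```

Neither figure is anywhere near the 35% and 43% quoted by the journalist.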


72.4

Critical Region and Critical Value

SYLLABUS ALERT

This is new to the 9758 (revised) syllabus. So skip this section if you’re taking 9740 (old).

Informally, the critical region is the set of values of the observed test statistic t for which we would reject the null hypothesis. The critical region is thus sometimes also called the rejection region. And the critical value(s) is (are) the exact value(s) of the observed test statistic t at which we are just able to reject the null hypothesis.

Example 672. (Dr. Chee election.) Say that as before, we have a one-tailed test where the two competing hypotheses are: H0 ∶ µ = 0.3, HA ∶ µ > 0.3.

Say that as before, we choose α = 0.05 as our significance level.

Say that as before, in our observed random sample of 100 votes, 39 are in favour of Dr. Chee, so that our observed test statistic is t = 39.

We calculated that the corresponding p-value is 0.03398 and so we were able to reject H0 at the α = 0.05 significance level.

We now calculate the critical region and the critical value. We can calculate that if t = 38, then the corresponding p-value is ≈ 0.053 (you should verify this for yourself). And so we would be unable to reject H0 .

We thus conclude that the critical value is 39, because this is the value of t at which we are just able to reject H0 . And the critical region is the set {39, 40, 41, . . . , 100}. These are the values at which we’d be able to reject H0 at the α = 0.05 significance level.
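One way to find the critical region in the one-tailed example above is brute force: compute the p-value for every possible value of t and keep those values for which p < α. The Python sketch below does exactly that. The function names are illustrative only.

```python
from math import comb

def upper_tail_p_value(t, n, mu):
    """One-tailed p-value P(T >= t | H0) when T ~ B(n, mu) under H0."""
    return sum(comb(n, k) * mu**k * (1 - mu)**(n - k) for k in range(t, n + 1))

def critical_region_upper(n, mu, alpha):
    """All values of t at which we'd reject H0 in favour of HA: mu is larger."""
    return [t for t in range(n + 1) if upper_tail_p_value(t, n, mu) < alpha]

region = critical_region_upper(n=100, mu=0.3, alpha=0.05)
critical_value = min(region)      # smallest t at which we can just reject H0
print(critical_value, "to", max(region))
```

The smallest value in the list is the critical value, and the whole list is the critical region.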


Same example as above, but now two-tailed: Example 672. (Dr. Chee election.) Say that as before, we have a two-tailed test where the two competing hypotheses are: H0 ∶ µ = 0.3, HA ∶ µ ≠ 0.3.

The significance level is again α = 0.05. Again, the observed random sample of 100 votes contains 39 in favour of Dr. Chee, so that our observed test statistic is t = 39. We calculated that the corresponding p-value is 0.06281 and so we failed to reject H0 at the α = 0.05 significance level. We calculate that if t = 40, then the corresponding p-value is ≈ 0.03745 (you should verify this for yourself). Thus, the critical values are 20 and 40, because these are the values of t at which we are just able to reject H0 .

The critical region is the set {0, 1, . . . , 20, 40, 41, . . . , 100}. These are the values at which we’d be able to reject H0 at the α = 0.05 significance level. Exercise 265. (Answer on p. 1114.) We flip a coin 20 times. What are the critical region and critical value(s) in (a) A test, at the 5% significance level, of whether the coin is biased towards heads. (b) A test, at the 5% significance level, of whether the coin is biased.


72.5

Testing of a Population Mean (Small Sample, Normal Distribution, σ² Known)

Example 607. The weight (in mg) of a grain of sand is $X \sim N(\mu, 9)$. Our unknown parameter of interest is the true population mean $\mu$ (i.e. the true average weight of a grain of sand). Our “guess” is that $\mu = 5$. We thus write down two competing hypotheses:

$$H_0: \mu = 5, \qquad H_A: \mu \ne 5.$$

(Note that this is a two-sided test.)

We take a random sample of size 4, $(X_1, X_2, X_3, X_4)$. Our test statistic is the sample mean $\bar{X} = (X_1 + X_2 + X_3 + X_4)/4$.

Our observed random sample is $(x_1, x_2, x_3, x_4) = (3, 9, 11, 7)$. That is, we randomly pick four grains of sand that happen to have weights 3, 9, 11, and 7 mg. Then the observed test statistic is

$$\bar{x} = \frac{3 + 9 + 11 + 7}{4} = 7.5.$$

The p-value is the probability that the test statistic $\bar{X}$ takes on values “at least as extreme as” our observed test statistic $\bar{x} = 7.5$, assuming $H_0: \mu = 5$ were true. Note that if $H_0$ were true, then $\bar{X} \sim N(\mu, \sigma^2/n) = N(5, 9/4)$. Thus, the p-value is given by

$$\begin{aligned}
p &= P(\bar{X} \ge 7.5 \text{ or } \bar{X} \le 2.5 \mid H_0) = P(\bar{X} \ge 7.5 \mid H_0) + P(\bar{X} \le 2.5 \mid H_0) \\
&= P\left(Z \ge \frac{7.5 - 5}{\sqrt{9/4}}\right) + P\left(Z \le \frac{2.5 - 5}{\sqrt{9/4}}\right) \approx 0.04779 + 0.04779 = 0.09558.
\end{aligned}$$

Thus, we reject H0 at the α = 0.1 significance level. However, we would fail to reject H0 at the α = 0.05 significance level.
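Here is a minimal Python sketch of the two-tailed Z-test in Example 607, using `statistics.NormalDist` from the standard library in place of a graphing calculator. The function name `z_test_two_tailed` is my own.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_tailed(x_bar, mu0, sigma2, n):
    """Two-tailed p-value for H0: mu = mu0 when X-bar ~ N(mu0, sigma2 / n) under H0."""
    z = (x_bar - mu0) / sqrt(sigma2 / n)
    return 2 * (1 - NormalDist().cdf(abs(z)))

sample = [3, 9, 11, 7]                       # observed weights in mg
x_bar = sum(sample) / len(sample)            # observed sample mean, 7.5
p_value = z_test_two_tailed(x_bar, mu0=5, sigma2=9, n=len(sample))

print(round(p_value, 5))
for alpha in (0.1, 0.05):
    print(alpha, "reject H0" if p_value < alpha else "fail to reject H0")
```

Doubling the upper tail works here because the normal distribution is symmetric, so the two tails in the calculation above are equal.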


The table below summarises the tests to use for the population mean, in different circumstances. In this section, we learnt how to handle the first case (any sample size, normal distribution, $\sigma^2$ known). The following sections will deal with the other three cases.

| Sample size | Distribution | $\sigma^2$ | Test |
|---|---|---|---|
| Any | Normal | Known | Z-test: $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$. |
| Large | Any | Known | Z-test: $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$. |
| Large | Any | Unknown | Z-test: $\dfrac{\bar{X} - \mu}{s/\sqrt{n}} \sim N(0, 1)$. |
| Small | Normal | Unknown | t-test: $\dfrac{\bar{X} - \mu}{s/\sqrt{n}} \sim t_{n-1}$ (Student’s t-distribution with $n - 1$ degrees of freedom).* |
| Small | Non-normal | Either | Not in A-levels. |

*Not in revised (9758) syllabus.

Exercise 266. The Singapore daily high temperature (in °C) can be modelled by X ∼ N (µ, 8). Our unknown parameter of interest is the true population mean µ (i.e. the true average daily high temperature). Your friend guesses that µ = 34. You gather the following data on daily high temperatures, of 10 randomly-chosen days in 2015: (35, 35, 31, 32, 33, 34, 31, 34, 35, 34). Test your friend’s hypothesis, at the α = 0.05 significance level. (Be sure to write down your null and alternative hypotheses.) (Answer on p. 1115.)


72.6

Testing of a Population Mean (Large Sample, Any Distribution, σ² Known)

We’ll recycle the same example from the previous section. Before, we knew that $X$ was normally distributed. Now the big difference is that we have absolutely no idea what distribution $X$ comes from! To compensate, we require also that our random sample is “large enough”, so that the CLT-approximation can be used.

Example 608. The weight (in mg) of a grain of sand is $X \sim (\mu, 9)$. (This says simply that $X$ is distributed with mean $\mu$ and variance 9.) Our unknown parameter of interest is the true population mean $\mu$ (i.e. the true average weight of a grain of sand). Again, we “guess” that $\mu = 5$. Again, we write down

$$H_0: \mu = 5, \qquad H_A: \mu \ne 5.$$

(Note that this is, again, a two-sided test.)

This time, we’ll take a random sample of size 100, $(X_1, X_2, \dots, X_{100})$. Again, our test statistic is the sample mean $\bar{X} = (X_1 + X_2 + \dots + X_{100})/100$.

Recall the magic of the CLT. Even if we have absolutely no idea what distribution $X$ is drawn from, provided $n$ is sufficiently large, $\bar{X}$ is approximately normally distributed. So here, since the sample is large ($n = 100 \ge 20$), by the CLT, we know that $\bar{X}$ has, approximately, the normal distribution $N(\mu, \sigma^2/n)$. So, if $H_0$ were true, then we have, approximately, $\bar{X} \sim N(\mu, \sigma^2/n) = N(5, 9/100)$.

Say the observed test statistic we get is:

$$\bar{x} = \frac{x_1 + x_2 + \dots + x_{100}}{100} = 5.6.$$

Again, the p-value is the probability that our test statistic $\bar{X}$ takes on values “at least as extreme as” our observed test statistic $\bar{x} = 5.6$, assuming $H_0: \mu = 5$ were true. Thus, the p-value is given by

$$\begin{aligned}
p &= P(\bar{X} \ge 5.6 \text{ or } \bar{X} \le 4.4 \mid H_0) = P(\bar{X} \ge 5.6 \mid H_0) + P(\bar{X} \le 4.4 \mid H_0) \\
&\overset{\text{CLT}}{\approx} P\left(Z \ge \frac{5.6 - \mu}{\sigma/\sqrt{n}}\right) + P\left(Z \le \frac{4.4 - \mu}{\sigma/\sqrt{n}}\right) = P\left(Z \ge \frac{5.6 - 5}{\sqrt{9/100}}\right) + P\left(Z \le \frac{4.4 - 5}{\sqrt{9/100}}\right) \\
&= P(Z \ge 2) + P(Z \le -2) \approx 0.0455.
\end{aligned}$$

Thus, we reject H0 at the α = 0.05 significance level. Exercise 267. The Singapore daily high temperature (in °C) can be modelled by X ∼ (µ, 8). Our unknown parameter of interest is the true population mean µ (i.e. the true average daily high temperature). Your friend guesses that µ = 34. You gather the data on daily high temperatures, of 100 randomly-chosen days in 2015 and find the observed sample average temperature to be 33.4 °C. Test your friend’s hypothesis, at the α = 0.05 significance level. (Be sure to write down your null and alternative hypotheses. Also, clearly state where you use the CLT.) (Answer on p. 1115.)
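Example 608 leans entirely on the CLT approximation, so it may be reassuring to see it at work. The simulation below is only a sketch under an assumption of my own: since the example deliberately leaves the distribution of X unspecified, I have picked a skewed, decidedly non-normal one (2 plus an exponential random variable with mean 3) that happens to have mean 5 and variance 9. If the approximation is good, about 95.45% of the sample means (the share P(−2 ≤ Z ≤ 2)) should land within two standard errors of µ.

```python
import random
from math import sqrt

random.seed(0)                       # fixed seed so the run is repeatable

MU, SIGMA2, N = 5.0, 9.0, 100        # mean, variance and sample size as in Example 608
REPLICATIONS = 2000                  # number of simulated random samples

def draw_weight():
    """One draw from a skewed distribution with mean 5 and variance 9.

    Hypothetical choice for illustration: 2 + exponential with mean 3.
    """
    return 2.0 + random.expovariate(1 / 3)

def sample_mean(n):
    """Observed sample mean of one simulated random sample of size n."""
    return sum(draw_weight() for _ in range(n)) / n

means = [sample_mean(N) for _ in range(REPLICATIONS)]
standard_error = sqrt(SIGMA2 / N)    # sqrt(9/100) = 0.3
within_two_se = sum(abs(m - MU) <= 2 * standard_error for m in means) / REPLICATIONS
print(f"share of sample means within 2 standard errors of mu: {within_two_se:.3f}")
```

Try shrinking N to, say, 4 and you should see the approximation get worse, which is why the large-sample requirement matters.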


72.7

Testing of a Population Mean (Large Sample, Any Distribution, σ² Unknown)

We’ll recycle the same example from the previous section. Again, we have absolutely no idea what distribution $X$ comes from. And again, the random sample is large enough, so that the CLT can be used. But now, $\sigma^2$ is unknown. This turns out to be no big deal. We can simply replace $\sigma^2$ with the observed unbiased sample variance $s^2$, and do the same thing as before.

Example 609. The weight (in mg) of a grain of sand is $X \sim (\mu, \sigma^2)$. (This says simply that $X$ is distributed with mean $\mu$ and variance $\sigma^2$.) Our unknown parameter of interest is the true population mean $\mu$ (i.e. the true average weight of a grain of sand). Again, we “guess” that $\mu = 5$. Again, we write down

$$H_0: \mu = 5, \qquad H_A: \mu \ne 5.$$

(Note that this is, again, a two-sided test.)

Again, we take a random sample of size 100, $(X_1, X_2, \dots, X_{100})$. Again, our test statistic is the sample mean $\bar{X} = (X_1 + X_2 + \dots + X_{100})/100$.

Again, since the sample is large ($n = 100 \ge 20$), by the CLT, we know that $\bar{X}$ has, approximately, the normal distribution $N(\mu, \sigma^2/n)$. So, if $H_0$ were true, then we have, approximately, $\bar{X} \sim N(\mu, \sigma^2/n) = N(5, \sigma^2/100)$. Since the sample variance $S^2$ is an unbiased estimator for $\sigma^2$, it is plausible that we also have, approximately, $\bar{X} \sim N(\mu, s^2/n) = N(5, s^2/100)$, where $s^2$ is the observed sample variance.

Say the observed sample mean and observed sample variance we get are:

$$\bar{x} = \frac{x_1 + x_2 + \dots + x_{100}}{100} = 5.6 \quad \text{and} \quad s^2 = \frac{\sum_{i=1}^{100} (x_i - \bar{x})^2}{n - 1} = 8.$$

Again, the p-value is the probability that our test statistic $\bar{X}$ takes on values “at least as extreme as” our observed test statistic $\bar{x} = 5.6$, assuming $H_0: \mu = 5$ were true. Thus, the p-value is given by

$$\begin{aligned}
p &= P(\bar{X} \ge 5.6 \text{ or } \bar{X} \le 4.4 \mid H_0) = P(\bar{X} \ge 5.6 \mid H_0) + P(\bar{X} \le 4.4 \mid H_0) \\
&\overset{\text{CLT}}{\approx} P\left(Z \ge \frac{5.6 - \mu}{s/\sqrt{n}}\right) + P\left(Z \le \frac{4.4 - \mu}{s/\sqrt{n}}\right) = P\left(Z \ge \frac{5.6 - 5}{\sqrt{8/100}}\right) + P\left(Z \le \frac{4.4 - 5}{\sqrt{8/100}}\right) \\
&\approx P(Z \ge 2.1213) + P(Z \le -2.1213) \approx 0.03389.
\end{aligned}$$

Thus, we reject H0 at the α = 0.05 significance level. Exercise 268. The Singapore daily high temperature (in °C) can be modelled by X ∼ (µ, σ 2 ). Our unknown parameter of interest is the true population mean µ (i.e. the true average daily high temperature). Your friend guesses that µ = 34. You gather the data on daily high temperatures, of 100 randomly-chosen days in 2015. Your observed sample mean temperature is 33.4 °C and your observed sample variance is 11.2 °C2 . Test your friend’s hypothesis, at the α = 0.05 significance level. (Be sure to write down your null and alternative hypotheses. Also, clearly state where you use the CLT.) (Answer on p. 1116.)


72.8 Testing of a Population Mean (Small Sample, Normal Distribution, σ 2 Unknown) SYLLABUS ALERT This is in the 9740 (old) syllabus, but not in the 9758 (revised) syllabus. So you can skip this section if you’re taking 9758. We’ll recycle the same example from section 72.5. The big difference is that now σ 2 (the variance of X) is unknown: Example 610. The weight (in mg) of a grain of sand is X ∼ N (µ, σ 2 ). Our unknown parameter of interest is the true population mean µ (i.e. the true average weight of a grain of sand). Again, we “guess” that µ = 5. Again, we write down H0 ∶ µ = 5, HA ∶ µ ≠ 5.

This time, we have resources only to take a random sample of size 4 — (X1 , X2 , X3 , X4 ).

Again, our test statistic is the sample mean $\bar{X} = (X_1 + X_2 + X_3 + X_4)/4$.

Suppose our observed random sample is $(x_1, x_2, x_3, x_4) = (3, 9, 11, 7)$. That is, we randomly pick four grains of sand with weights 3, 9, 11, and 7 mg. Then the observed test statistic is

$$\bar{x} = \frac{3 + 9 + 11 + 7}{4} = 7.5.$$

Note that if $H_0$ were true, then $\bar{X} \sim N(\mu, \sigma^2/n) = N(5, \sigma^2/4)$. This, however, is of no use if we do not know what $\sigma^2$ is. Here’s an idea: why don’t we use the unbiased sample variance $s^2$ in place of $\sigma^2$? We can easily calculate $s^2$:

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{(3 - 7.5)^2 + (9 - 7.5)^2 + (11 - 7.5)^2 + (7 - 7.5)^2}{4 - 1} = \frac{35}{3}.$$


Wonderful. Can we now conclude, as we did in the previous section, that since the sample variance $S^2$ is an unbiased estimator for $\sigma^2$, we have, approximately, $\bar{X} \sim N(\mu, s^2/n) = N\left(5, \frac{35}{3}/4\right)$? Sadly, we cannot do so here in the small sample case. We cannot simply replace the unknown true variance $\sigma^2$ with the unbiased sample variance $s^2$ and say that we have a good approximate distribution.

Instead, it turns out that the random variable $\dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ has Student’s t-distribution with $n - 1$ degrees of freedom (the proof of this fact is omitted from this textbook). Equivalently, we may write $\dfrac{\bar{X} - \mu}{s/\sqrt{n}} = T_{n-1}$. Just like with the normal distribution, we can use our graphing calculator to calculate probabilities associated with the t-distribution. (There are also tables we can use.)

$T_\nu$ is the random variable with Student’s t-distribution with $\nu$ degrees of freedom. $T_\nu$ has a rather complicated PDF that we shall relegate to the Appendices (see Definition 172). All you need know is that
• $T_\nu$ has mean 0.
• Its PDF looks very similar to that of the standard normal distribution, except it is shorter and fatter.
• As $\nu \to \infty$, its PDF looks taller and thinner, until eventually it coincides exactly with $Z$.


Example 610 (continued from above). The p-value in this case is the probability that our test statistic $\bar{X}$ took on values “at least as extreme as” our observed test statistic $\bar{x} = 7.5$, assuming $H_0: \mu = 5$ were true. Thus, the p-value is given by

$$\begin{aligned}
p &= P(\bar{X} \geq 7.5 \text{ or } \bar{X} \leq 2.5 \mid H_0) = P(\bar{X} \geq 7.5 \mid H_0) + P(\bar{X} \leq 2.5 \mid H_0) \\
&= P\left(T_{n-1} \geq \frac{7.5 - \mu}{s/\sqrt{n}}\right) + P\left(T_{n-1} \leq \frac{2.5 - \mu}{s/\sqrt{n}}\right) \\
&= P\left(T_3 \geq \frac{7.5 - 5}{\sqrt{(35/3)/4}}\right) + P\left(T_3 \leq \frac{2.5 - 5}{\sqrt{(35/3)/4}}\right) \\
&\approx P(T_3 \geq 1.46385) + P(T_3 \leq -1.46385) \approx 0.1197 + 0.1197 = 0.2394.
\end{aligned}$$

To calculate P (T3 ≥ 1.46385), follow these steps in your TI84:

1. Press the blue 2ND button and then VARS (which corresponds to the DISTR button). This brings up the DISTR menu.
2. Press 6 to select the “tcdf” option. The TI84 is now prompting you to enter the lower bound, the upper bound, and the degrees of freedom. Our lower bound is 1.46385, our upper bound is ∞, and our degrees of freedom are 3. So:
3. Enter the lower bound 1.46385 by pressing 1 . 4 6 3 8 5 . Then press , . Now enter the upper bound $10^{99}$ (in place of ∞) by pressing the blue 2ND button, EE (which corresponds to the , button), and then 9 9 . Now press , . Now enter the degrees of freedom by pressing 3 . Now simply press ENTER .

Your TI84 tells us that P(T3 ≥ 1.46385) ≈ 0.119721312.

By symmetry, P (T3 ≤ −1.46385) ≈ 0.119721312.

Thus, we fail to reject H0 at the α = 0.1 significance level.

Observe that previously, when σ² was known, we had p = 0.09558 and we were able to reject H0 at the α = 0.1 significance level. Now in contrast, when σ² is unknown, the p-value is much larger (p ≈ 0.2394) and we are unable to reject H0 at the same significance level. (This observation holds in general. All else equal, it is harder to reject H0 when running a t-test than when running a Z-test.)
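For readers without a TI84 to hand, the two-tailed p-value of Example 610 can be cross-checked with a short Python sketch. It is only an illustration under stated assumptions: the function names `t_pdf` and `t_upper_tail` are made up here, and Simpson's rule applied to the standard t-density stands in for the calculator's tcdf routine.

```python
import math

def t_pdf(t, nu):
    """Density of Student's t-distribution with nu degrees of freedom."""
    coeff = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return coeff * (1 + t * t / nu) ** (-(nu + 1) / 2)

def t_upper_tail(t, nu, steps=2000):
    """P(T_nu >= t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, nu) + t_pdf(t, nu)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, nu)
    # By symmetry, half the probability lies above 0; subtract the area on [0, t].
    return 0.5 - total * h / 3

sample = [3, 9, 11, 7]   # observed weights (mg) from Example 610
mu_0 = 5                 # value of mu under H0
n = len(sample)
x_bar = sum(sample) / n
s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)   # unbiased sample variance
t_obs = (x_bar - mu_0) / math.sqrt(s2 / n)
p_value = 2 * t_upper_tail(abs(t_obs), n - 1)          # two-tailed test

print(round(t_obs, 5), round(p_value, 4))
```

The printed values should agree with the 1.46385 and 0.2394 obtained above, up to rounding.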

Exercise 269. The Singapore daily high temperature (in °C) can be modelled by X ∼ N (µ, σ 2 ). Our unknown parameter of interest is the true population mean µ (i.e. the true average daily high temperature). Your friend guesses that µ = 34. You gather the following data on daily high temperatures, of 10 randomly-chosen days in 2015: (35, 35, 31, 32, 33, 34, 31, 34, 35, 34). Test your friend’s hypothesis, at the α = 0.05 significance level. (Be sure to write down your null and alternative hypotheses.) (Answer on p. 1117.)


72.9 Formulation of Hypotheses

SYLLABUS ALERT: This is new to the 9758 (revised) syllabus. So skip this section if you’re taking 9740 (old).

Example 611. We flip a coin 100 times. We get 100 heads. What can we say about the coin? This is an open-ended question, to which there can be many different answers. Here’s the answer we’re taught to give for H2 Maths:

Let µ be the probability that a coin-flip is heads. We formulate a pair of competing hypotheses: H0: µ = 0.5, HA: µ ≠ 0.5.

Our test statistic T is the number of heads (out of 100 coin-flips). Our observed test statistic t is 100. The corresponding p-value (note that this is a two-tailed test) is

$$P(T \geq 100 \text{ or } T \leq 0 \mid H_0) = P(T = 0 \mid H_0) + P(T = 100 \mid H_0) = \binom{100}{0} 0.5^{0}\, 0.5^{100} + \binom{100}{100} 0.5^{100}\, 0.5^{0} \approx 1.578 \times 10^{-30}.$$
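The p-value in Example 611 is small enough that it is easy to mistype on a calculator, so here is a minimal Python sketch of the same two-term binomial sum. The helper name `binom_pmf` is purely illustrative.

```python
from math import comb

n, p = 100, 0.5   # 100 flips of a coin that is fair under H0

def binom_pmf(k):
    """P(T = k) when T ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Two-tailed test: outcomes at least as extreme as 100 heads are T = 100 and T = 0.
p_value = binom_pmf(0) + binom_pmf(100)
print(p_value)
```

The result is exactly 2 × 0.5^100, which is about 1.578 × 10^-30.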

The tiny p-value may be interpreted as casting doubt on or providing evidence against H0. We note also that we can easily reject H0 at any of the conventional significance levels (α = 0.1, α = 0.05, or α = 0.01).

Exercise 270. (Answer on p. 1117.) We observe the weights (in kg) of a random sample of 50 Singaporeans: $(x_1, x_2, \ldots, x_{50})$. We observe that $\sum x_i/50 = 68$ and $\sum x_i^2/50 = 5000$.

A friend claims that the average American is heavier than the average Singaporean. It is known that the average American weighs 75 kg. Is your friend correct? If you make any assumptions or approximations, make clear exactly where you do so. (Hint: Use Fact 81(a)).

73 Correlation and Linear Regression

73.1 Bivariate Data and Scatter Diagrams

In this chapter, we’ll be interested in the relationship between two sets of data.

Example 612. We measure the heights and weights of 10 adult male Singaporeans. Their heights (in cm) and weights (in kg) are given in this table:

i           1    2    3    4    5    6    7    8    9   10
h_i (cm)  182  165  173  155  178  174  169  160  150  190
w_i (kg)   81   70   71   53   72   75   69   60   44   80

We call (h_i, w_i) observation i. So for example, observation 5 is (178, 72) and observation 9 is (150, 44).

We can plot a scatter diagram of these 10 persons’ weights (vertical axis) against their heights (horizontal).

[Figure: scatter diagram of weight (kg, vertical axis) against height (cm, horizontal axis), with a black dotted line of best fit.]

The black dotted line is called a line of best fit. Shortly (section 73.4), we’ll learn how to construct this line of best fit. The more closely the data points in the above scatter diagram lie to a straight line, the more strongly linearly-correlated are weight and height. So here with these particular data, the linear correlation between weight and height seems strong. In the next section, we’ll learn about the product moment correlation coefficient, which is a way to precisely quantify the degree to which two sets of data are linearly-correlated. Because the line of best fit is upward-sloping, we can also say that the linear correlation is positive.


Example 613. We have data from the Clementi weather station for the daily high temperature (in °C) and daily rainfall (in mm) on 361 days in 2015. (Strangely, data were missing for four days, namely Feb 10-13.)

i            1     2     3     4   ...   361
t_i (°C)   27.3  29.5  31.1  32    ...   30.2
p_i (mm)    0     0.2   0     0    ...   12.4

We can again plot a scatter diagram of rainfall against temperature.

[Figure: scatter diagram of rainfall (mm, vertical axis) against temperature (degrees Celsius, horizontal axis), with a black dotted line of best fit.]

Again, the black dotted line is a line of best fit. The data points do not seem close to this line. Thus, it seems that the linear correlation between temperature and rainfall is weak. The line of best fit is downward-sloping and so we say that the linear correlation is negative.

Exercise 271. (Answer on p. 1118.) The table below shows the prices charged (p) and the number of haircuts (q) given by 5 different barbers, during June 2016. Draw a scatter diagram with price on the horizontal axis. Plot also what you think looks like a line of best fit.

i          1    2     3    4    5
p_i ($)    8    9     4   10    8
q_i      300  250  1000  400  400

73.2 Product Moment Correlation Coefficient (PMCC)

In the previous section, we used a scatter diagram to determine if there was a plausible linear relationship between two sets of data. This, though, was a very crude method. A more precise measure of the degree to which two sets of data are linearly correlated is called the product moment correlation coefficient (PMCC). Formally:

Definition 136. Let $(x_1, x_2, \ldots, x_n)$ and $(y_1, y_2, \ldots, y_n)$ be two ordered sets of real numbers. The product moment correlation coefficient (PMCC) is the following real number:

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}.$$

Properties of the PMCC.

1. −1 ≤ r ≤ 1. (Surprisingly, this can be proven using vectors: Fact 105 in the Appendices.)
2. We say the linear correlation is positive if r > 0 and negative if r < 0.
3. If r = 1, the linear correlation is positive and perfect.
4. If r = −1, the linear correlation is negative and perfect.
5. If r is close to 1, the linear correlation is very strong.
6. If r is close to −1, the linear correlation is very strong.
7. If r is close to 0, the linear correlation is very weak.
8. r is merely a measure of linear correlation and nothing else. Two variables may be very closely related but not linearly-correlated. For example, data generated by the quadratic model $y_i = x_i^2$ may have a very low r.
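Property 8 can be illustrated with a minimal Python sketch. The data below are hypothetical (seven x-values chosen to be symmetric about 0), and `pmcc` is simply an illustrative helper that follows Definition 136.

```python
import math

def pmcc(xs, ys):
    """Product moment correlation coefficient of two equal-length sequences."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

xs = [-3, -2, -1, 0, 1, 2, 3]   # hypothetical x-values, symmetric about 0
ys = [x ** 2 for x in xs]       # y is determined exactly by x, but not linearly
r = pmcc(xs, ys)
print(r)
```

Here y is a perfect function of x, yet the positive and negative products in the numerator cancel, so r is 0.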


Example 612 (continued from above). This is the height and weight example revisited. For convenience, we reproduce the data and scatter diagram:

i           1    2    3    4    5    6    7    8    9   10
h_i (cm)  182  165  173  155  178  174  169  160  150  190
w_i (kg)   81   70   71   53   72   75   69   60   44   80

[Figure: scatter diagram of weight (kg) against height (cm), with the line of best fit.]

$$\bar{h} = \frac{182 + 165 + 173 + 155 + 178 + 174 + 169 + 160 + 150 + 190}{10} = 169.6,$$

$$\bar{w} = \frac{81 + 70 + 71 + 53 + 72 + 75 + 69 + 60 + 44 + 80}{10} = 67.5,$$

$$\sum_{i=1}^{n} (h_i - \bar{h})(w_i - \bar{w}) = (182 - \bar{h})(81 - \bar{w}) + \cdots + (190 - \bar{h})(80 - \bar{w}) = 1237,$$

$$\sqrt{\sum_{i=1}^{n} (h_i - \bar{h})^2} = \sqrt{(182 - 169.6)^2 + \cdots + (190 - 169.6)^2} \approx 37.180640,$$

$$\sqrt{\sum_{i=1}^{n} (w_i - \bar{w})^2} = \sqrt{(81 - 67.5)^2 + \cdots + (80 - 67.5)^2} \approx 35.418922,$$

$$\implies r = \frac{\sum_{i=1}^{n} (h_i - \bar{h})(w_i - \bar{w})}{\sqrt{\sum_{i=1}^{n} (h_i - \bar{h})^2}\, \sqrt{\sum_{i=1}^{n} (w_i - \bar{w})^2}} \approx 0.9393.$$

As expected, r > 0 (the linear correlation is positive or, equivalently, the line of best fit is upward-sloping). Moreover, r is close to 1 (the linear correlation is very strong).
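The arithmetic for the height and weight PMCC is tedious by hand, so here is a minimal Python sketch that recomputes each intermediate quantity from the ten observations. The variable names (`s_hw`, `s_hh`, `s_ww`) are illustrative shorthand, not notation used elsewhere in this textbook.

```python
import math

heights = [182, 165, 173, 155, 178, 174, 169, 160, 150, 190]   # cm
weights = [81, 70, 71, 53, 72, 75, 69, 60, 44, 80]             # kg

n = len(heights)
h_bar = sum(heights) / n
w_bar = sum(weights) / n

# Sum of products of deviations, and the two sums of squared deviations.
s_hw = sum((h - h_bar) * (w - w_bar) for h, w in zip(heights, weights))
s_hh = sum((h - h_bar) ** 2 for h in heights)
s_ww = sum((w - w_bar) ** 2 for w in weights)

r = s_hw / (math.sqrt(s_hh) * math.sqrt(s_ww))
print(h_bar, w_bar, round(s_hw, 1), round(r, 4))
```

The output should match the means 169.6 and 67.5, the sum 1237, and r ≈ 0.9393 found above.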

Example 613 (continued from above). This is the temperature and rainfall example revisited. For convenience, we reproduce the data and scatter diagram:

i            1     2     3     4   ...   361
t_i (°C)   27.3  29.5  31.1  32    ...   30.2
p_i (mm)    0     0.2   0     0    ...   12.4

[Figure: scatter diagram of rainfall (mm) against temperature (degrees Celsius), with the line of best fit.]

$$\bar{t} = \frac{27.3 + 29.5 + 31.1 + 32 + \cdots + 30.2}{361} \approx 31.5, \qquad \bar{p} = \frac{0 + 0.2 + 0 + 0 + \cdots + 12.4}{361} \approx 5.0.$$

$$\implies r = \frac{\sum_{i=1}^{n} (t_i - \bar{t})(p_i - \bar{p})}{\sqrt{\sum_{i=1}^{n} (t_i - \bar{t})^2}\, \sqrt{\sum_{i=1}^{n} (p_i - \bar{p})^2}} = \frac{(27.3 - 31.5)(0 - 5.0) + \cdots + (30.2 - 31.5)(12.4 - 5.0)}{\sqrt{(27.3 - 31.5)^2 + \cdots + (30.2 - 31.5)^2}\, \sqrt{(0 - 5.0)^2 + \cdots + (12.4 - 5.0)^2}} \approx -0.1623.$$

As expected, r < 0 (the linear correlation is negative or, equivalently, the line of best fit is downward-sloping). Moreover, r is fairly close to 0 (the linear correlation is weak).

Exercise 272. Compute the PMCC between p and q, using the data below. (Answer on p. 1118.)

i          1    2     3    4    5
p_i ($)    8    9     4   10    8
q_i      300  250  1000  400  400

73.3 Correlation Does Not Imply Causation (Optional)

Correlation does not imply causation. This saying has now become a cliché. Doesn’t make it any less true. Below is an amusing but spurious correlation (source: tylervigen.com).

[Figure: US spending on science, space, and technology (in billions of dollars) plotted alongside suicides by hanging, strangulation and suffocation, for the years 1999 to 2009.]

The PMCC is r ≈ 0.99789126. So the two sets of data are almost perfectly linearly-correlated. But of course, this doesn’t mean that spending on science causes suicides or that suicides cause spending on science. More likely, the correlation is simply spurious.

[Figure: a comic from xkcd.]

73.4 Linear Regression

Example 612 (continued from above). We suspect that the heights and weights of adult male Singaporeans are linearly-correlated. We thus write down this linear model: w = a + bh.

Recall the quote: “All models are wrong, but some are useful.” The model w = a + bh is unlikely to be exactly correct. But hopefully it will be useful.

We treat a and b as unknown parameters (do you expect b to be positive or negative?). Our goal is to try to get estimates for a and b, from an observed random sample of height and weight data. We recycle the data from earlier. These, along with the scatter diagram, are reproduced for convenience.

i           1    2    3    4    5    6    7    8    9   10
h_i (cm)  182  165  173  155  178  174  169  160  150  190
w_i (kg)   81   70   71   53   72   75   69   60   44   80

[Figure: scatter diagram of weight (kg) against height (cm), with three candidate lines of best fit, one of them a black dotted line.]

The basic idea of linear regression is this: Find the line that “best fits” the given data. Drawn in the figure above are three plausible candidates for the “line of best fit”. But there can only be one line of best fit. Which is it? At the end of the day, we’ll choose the black dotted line as “the” line of best fit. But why? This will be answered in the next section.


Example 613 (continued from above). We suspect that daily rainfall and daily high temperatures for 2015 were linearly-correlated. We thus write down this linear model: p = a + bt.

Again, our goal is to get estimates for the unknown parameters a and b (do you expect b to be positive or negative?). We gather the following data (recycled from before):

i            1     2     3     4   ...   361
t_i (°C)   27.3  29.5  31.1  32    ...   30.2
p_i (mm)    0     0.2   0     0    ...   12.4

We can again plot a scatter diagram of rainfall against temperature.

[Figure: scatter diagram of rainfall (mm) against temperature (degrees Celsius), with several candidate lines of best fit, one of them a black dotted line.]

Again, drawn in the figure above are several plausible candidates for the “line of best fit”. It turns out that the black dotted line will be “the” line of best fit.

73.5 Ordinary Least Squares (OLS)

There are different methods for determining “the” line of best fit. Different methods will generally give different lines of best fit. The method we’ll learn in H2 Maths is the most basic and most standard method. It is called the method of ordinary least squares (OLS).

Let’s assume there is some true linear model, which may be written as $y = a + bx$. As always, we stick to the objectivist interpretation. The parameters a and b have some true, fixed values. However, they are unknown (and may forever be unknown). Nonetheless, we’ll try to do our best and get estimates for a and b. These estimates will be denoted $\hat{a}$ and $\hat{b}$. And our line of best fit will then be $y = \hat{a} + \hat{b}x$.

How do we find this line of best fit? Intuitively, this will be the line to which the data points are “as close as possible”. But there are many ways to define the term “as close as possible”. For example, we could try to minimise the sum of the distances between the points and the line. But we shall not do this. Instead, we’ll use the method of OLS:
1. Measure the vertical distance of each data point $(x_i, y_i)$ from the line. This is called the residual and is denoted $\hat{u}_i$.
2. Our goal is to find the line $y = \hat{a} + \hat{b}x$ that minimises $\sum \hat{u}_i^2$. This quantity is called the Sum of Squared Residuals (SSR).

Example 612 (height and weight example revisited). Our candidate line of best fit is $w = \hat{a} + \hat{b}h = 65 + 0h = 65$. This is a horizontal line, which simply “predicts” that everyone’s weight is always 65 kg, regardless of their height. (This is a somewhat silly candidate line of best fit. Not surprisingly, this is not the actual line of best fit.)

[Figure: scatter diagram of weight (kg) against height (cm), showing the horizontal candidate line w = 65.]

i                       1    2    3    4    5    6    7    8    9   10
h_i (cm)              182  165  173  155  178  174  169  160  150  190
w_i (kg)               81   70   71   53   72   75   69   60   44   80
ŵ_i (kg)               65   65   65   65   65   65   65   65   65   65
û_i = w_i − ŵ_i (kg)   16    5    6  −12    7   10    4   −5  −21   15

The second last row of the above table gives, for each person with height $h_i$, the corresponding predicted weight $\hat{w}_i$ (as per our candidate line of best fit). The residual $\hat{u}_i$ (last row) is then defined as the vertical distance between the data point and the weight predicted by the candidate line of best fit.

The SSR is

$$\sum_{i=1}^{10} \hat{u}_i^2 = 16^2 + 5^2 + 6^2 + (-12)^2 + 7^2 + 10^2 + 4^2 + (-5)^2 + (-21)^2 + 15^2 = 1317.$$

Can we do better than this? That is, can we find another candidate line of best fit whose SSR is smaller than 1317?
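One way to explore this question is to compute the SSR for any candidate line you like. The sketch below is a minimal Python illustration; the function name `ssr` and the second candidate line tried (a rounded version of the line of best fit) are choices made here for illustration.

```python
heights = [182, 165, 173, 155, 178, 174, 169, 160, 150, 190]   # cm
weights = [81, 70, 71, 53, 72, 75, 69, 60, 44, 80]             # kg

def ssr(a, b):
    """Sum of squared residuals of the candidate line w = a + b*h."""
    return sum((w - (a + b * h)) ** 2 for h, w in zip(heights, weights))

print(ssr(65, 0))                        # the horizontal candidate line w = 65
print(round(ssr(-84.26, 0.8948), 1))     # a sloped candidate line, for comparison
```

The first call reproduces the SSR of 1317; the second shows that a suitably sloped line does far better.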


The following fact gives two formulae for $\hat{b}$, the slope of the line of best fit. Formula (i) is printed in the List of Formulae you get during exams, but formula (ii) is not.

Fact 85. Let $(x_1, x_2, \ldots, x_n)$ and $(y_1, y_2, \ldots, y_n)$ be two ordered sets of data. The OLS regression line of y on x is $y - \bar{y} = \hat{b}(x - \bar{x})$, where

$$\text{(i)}\quad \hat{b} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \text{(ii)}\quad \hat{b} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2}.$$

Moreover, the regression line can also be written in the form $y = \hat{a} + \hat{b}x$, where $\hat{b}$ is as given above and $\hat{a} = \bar{y} - \hat{b}\bar{x}$.

Proof. We want to find $\hat{a}$ and $\hat{b}$ such that the line $y = \hat{a} + \hat{b}x$ has the smallest SSR possible.

The residual $\hat{u}_i$ is defined as the vertical distance between $(x_i, y_i)$ and the line $y = \hat{a} + \hat{b}x$. That is, $\hat{u}_i = y_i - \hat{y}_i = y_i - (\hat{a} + \hat{b}x_i)$, where $\hat{y}_i$ is the value the line predicts at $x_i$.

Thus, the SSR is $\sum \hat{u}_i^2 = \sum [y_i - (\hat{a} + \hat{b}x_i)]^2$.

We wish to minimise the SSR, by choosing appropriate values of $\hat{a}$ and $\hat{b}$. This involves the following pair of first order conditions:⁸⁴

$$\frac{\partial \sum \hat{u}_i^2}{\partial \hat{a}} = 0, \qquad \frac{\partial \sum \hat{u}_i^2}{\partial \hat{b}} = 0.$$

The remainder of the proof simply involves taking derivatives and doing the algebra, and is continued on p. 931 in the Appendices.

Remark 9. Whenever we simply say regression line or line of best fit, it may safely be assumed that we are talking about the OLS regression line.
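As a sanity check on Fact 85, the following minimal Python sketch evaluates formulae (i) and (ii) on the height and weight data of Example 612 and confirms that they give the same slope. The variable names `b_i` and `b_ii` are illustrative labels for the two formulae.

```python
heights = [182, 165, 173, 155, 178, 174, 169, 160, 150, 190]   # x-values (cm)
weights = [81, 70, 71, 53, 72, 75, 69, 60, 44, 80]             # y-values (kg)

n = len(heights)
x_bar = sum(heights) / n
y_bar = sum(weights) / n

# Formula (i): deviations from the means.
b_i = (sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))
       / sum((x - x_bar) ** 2 for x in heights))

# Formula (ii): raw sums, corrected by n times the means.
b_ii = ((sum(x * y for x, y in zip(heights, weights)) - n * x_bar * y_bar)
        / (sum(x * x for x in heights) - n * x_bar ** 2))

print(round(b_i, 4), round(b_ii, 4))
```

Formula (ii) is often quicker by hand because it avoids subtracting the mean from every observation.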

84 There’s a bit of hand-waving here.

Example 612 (height and weight example revisited). We already calculated

$$\bar{h} = 169.6, \qquad \bar{w} = 67.5, \qquad \sum_{i=1}^{n} (h_i - \bar{h})^2 = 1382.4, \qquad \sum_{i=1}^{n} (h_i - \bar{h})(w_i - \bar{w}) = 1237.$$

So,

$$\hat{b} = \frac{\sum_{i=1}^{n} (h_i - \bar{h})(w_i - \bar{w})}{\sum_{i=1}^{n} (h_i - \bar{h})^2} = \frac{1237}{1382.4} \approx 0.8948.$$

Thus, the regression line is $w - 67.5 = 0.8948(h - 169.6)$ or $w = \hat{a} + \hat{b}h = -84.26 + 0.8948h$.

[Figure: scatter diagram of weight (kg) against height (cm), showing the OLS regression line.]

i                        1     2     3     4     5     6     7     8     9    10
h_i (cm)               182   165   173   155   178   174   169   160   150   190
w_i (kg)                81    70    71    53    72    75    69    60    44    80
ŵ_i (kg)              78.6  63.4  70.5  54.4  75.0  71.4  67.0  58.9  50.0  85.8
û_i = w_i − ŵ_i (kg)   2.4   6.6   0.5  −1.4  −3.0   3.6   2.0   1.1  −6.0  −5.8

The SSR for the actual line of best fit is $\sum_{i=1}^{10} \hat{u}_i^2 = 2.4^2 + \cdots + (-5.8)^2 \approx 147.6$. This is much better than the SSR of 1317 that we found for the previous candidate line of best fit, which was simply a horizontal line.
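The whole calculation (slope, intercept, fitted values, residuals, SSR) can be reproduced with a minimal Python sketch. The variable names are illustrative, and the estimates are kept unrounded inside the code, so the last digit of an individual fitted value may occasionally differ from a hand calculation that uses rounded coefficients.

```python
heights = [182, 165, 173, 155, 178, 174, 169, 160, 150, 190]   # cm
weights = [81, 70, 71, 53, 72, 75, 69, 60, 44, 80]             # kg

n = len(heights)
h_bar = sum(heights) / n
w_bar = sum(weights) / n

# OLS estimates from Fact 85: slope by formula (i), then the intercept.
b_hat = (sum((h - h_bar) * (w - w_bar) for h, w in zip(heights, weights))
         / sum((h - h_bar) ** 2 for h in heights))
a_hat = w_bar - b_hat * h_bar

fitted = [a_hat + b_hat * h for h in heights]            # predicted weights
residuals = [w - f for w, f in zip(weights, fitted)]     # observed minus predicted
ssr = sum(u ** 2 for u in residuals)

print(round(a_hat, 2), round(b_hat, 4), round(ssr, 1))
print([round(f, 1) for f in fitted])
```

The first line of output should agree with the estimates −84.26 and 0.8948 and the SSR of about 147.6.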


Exercise 273. (a) Find the regression line of q on p, using the data below. (b) Complete the table. (c) Draw the scatter diagram, including the regression line and the corresponding residuals. (d) Compute the SSR. (Answer on p. 1119.)

i                  1    2     3    4    5
p_i ($)            8    9     4   10    8
q_i              300  250  1000  400  400
q̂_i
û_i = q_i − q̂_i

73.6 TI84 to Calculate the PMCC and the OLS Estimates

Example 614. We’ll find the PMCC and the regression line for these data:

i      1   2   3   4   5
x_i    1   7   3  11   8
y_i   14   5   6   4   4

1. Press ON to turn on your calculator.
2. Press the blue 2ND button and then CATALOG (which corresponds to the 0 button). This brings up the CATALOG menu.
3. Using the down arrow key ∨ , scroll down until the cursor is on DiagnosticOn.
4. Press ENTER once. And press ENTER a second time. The TI84 now says “DONE”, telling you that the Diagnostic option has been turned on.

The above steps need only be performed once. Unless of course you’ve just reset your calculator (as is required before each exam). In which case you have to go through the above steps again.

5. Press STAT to bring up the STAT menu.
6. Press 1 to select the “1:Edit” option.
7. The TI84 now prompts you to enter data under the column titled “L1”. This is where you should enter the data for x, using the numeric pad and the ENTER key as is appropriate. (I omit from this step the exact buttons you should press.)
8. After entering the last entry, press the right arrow key > to go to column L2. So enter the data for y, again using the numeric pad and the ENTER key as is appropriate.
9. Now press STAT to again bring up the STAT menu.
10. Press the right arrow key > to go to the CALC submenu.
11. Press 4 to select the “4:LinReg(ax+b)” option.
12. To tell the TI84 to go ahead and do the calculations, simply press ENTER .

The TI84 tells you that the PMCC is r = −.8147656398. The equation of the regression line of y on x is y = ax + b = −.859375x + 11.75625. (Be careful to note that the TI84 uses the symbol “a” for the coefficient for x, whereas in the A-level List of Formulae, they use b instead. Don’t get these mixed up!)
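The calculator output for Example 614 can be confirmed independently with a minimal Python sketch that applies Definition 136 and Fact 85 directly. To mirror the TI84 display, the names `slope` and `intercept` below correspond to the calculator's a and b respectively; those names are illustrative choices.

```python
import math

xs = [1, 7, 3, 11, 8]
ys = [14, 5, 6, 4, 4]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)

slope = s_xy / s_xx                   # the TI84's "a" in LinReg(ax+b)
intercept = y_bar - slope * x_bar     # the TI84's "b"
r = s_xy / math.sqrt(s_xx * s_yy)     # the PMCC

print(slope, intercept, r)
```

These should agree with the −.859375, 11.75625, and −.8147656398 shown by the TI84.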

Exercise 274. Using your TI84, find the PMCC between q and p, and also find the regression line of q on p (see data below). Verify that your answer for this exercise is the same as those in the last two exercises. (Answer on p. 1120.)

i          1    2     3    4    5
p_i ($)    8    9     4   10    8
q_i      300  250  1000  400  400

73.7 Interpolation and Extrapolation

Given any value of x, we call the corresponding $\hat{y} = \hat{b}(x - \bar{x}) + \bar{y}$ the fitted value or the predicted value. One use of the regression line is that it can help us predict (or “guess”) the value of y, even for x for which we have no data.

Example 612 (height and weight example revisited). Say we want to guess the weight of an adult male Singaporean who is 185 cm tall. Using our regression line, we predict that his weight is $\hat{w}_{h=185} = 0.8948 \times 185 - 84.26 \approx 81.3$ kg. This is called interpolation, because we are predicting the weight of a person whose height is between two of our observations.

Say instead we want to guess the weight of an adult male Singaporean who is 210 cm tall. Using our regression line, we predict that his weight is $\hat{w}_{h=210} = 0.8948 \times 210 - 84.26 \approx 103.6$ kg. This is called extrapolation, because we are predicting the weight of a person whose height is beyond our rightmost observation.

i            1     2     3     4     5     6     7     8     9    10
h_i (cm)   182   165   173   155   178   174   169   160   150   190   185    210
w_i (kg)    81    70    71    53    72    75    69    60    44    80     –      –
ŵ_i (kg)  78.6  63.4  70.5  54.4  75.0  71.4  67.0  58.9  50.0  85.8  81.3  103.6

[Figure: scatter diagram of weight (kg) against height (cm), with the regression line extended to show the predictions at 185 cm and 210 cm.]
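The two predictions can be reproduced, and labelled as interpolation or extrapolation, with a minimal Python sketch. It uses the rounded estimates quoted in the text; the function name `predict_weight` is illustrative, and "interpolation" is taken here to mean that the height lies within the range of observed heights.

```python
heights = [182, 165, 173, 155, 178, 174, 169, 160, 150, 190]   # observed heights (cm)
a_hat, b_hat = -84.26, 0.8948    # rounded OLS estimates quoted in the text

def predict_weight(h):
    """Return the fitted weight (kg) at height h (cm) and the kind of prediction."""
    inside = min(heights) <= h <= max(heights)
    kind = "interpolation" if inside else "extrapolation"
    return a_hat + b_hat * h, kind

for h in (185, 210):
    w_hat, kind = predict_weight(h)
    print(h, round(w_hat, 1), kind)
```

The sketch prints 81.3 kg for 185 cm (interpolation) and 103.6 kg for 210 cm (extrapolation), matching the values above.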

For the A-level exams, you are supposed to mindlessly and formulaically say that “Extrapolation is less reliable than interpolation”, because

The former predicts what’s beyond the known observations; the latter predicts what’s between two known observations.

This, though, is not a very satisfying explanation for why extrapolation is “less reliable” than interpolation. It merely leads to another question: “Why should a prediction be more reliable if done between two known observations, than if done to the right of the right-most observation (or to the left of the left-most observation)?” We won’t give an adequate answer to this latter question. Instead, we’ll simply give a bunch of examples to illustrate the dangers of extrapolation:

Example 615. A man on a diet weighs 115 kg in Week #1. Here’s a chart of his weight loss.

[Figure: chart of the man’s weight by week, with the OLS line of best fit.]

The OLS line of best fit suggests that he has been losing about 0.5 kg a week. He forgot to record his weight on Week #6. By interpolation, we “predict” that his weight that week was 112.5 kg. This is probably a reliable guess. By extrapolation, we predict that his weight on Week #201 will be 15 kg. This guess is obviously absurd. It requires that he keep losing 0.5 kg a week for nearly 4 years.

Example 616. A growing boy is 160 cm tall in Month #1. Here’s a chart of his growth.

[Figure: chart of the boy’s height by month, with the OLS line of best fit.]

The OLS line of best fit suggests that he has been growing by about 1 cm a month. He forgot to record his height in Month #6. By interpolation, we “predict” that his height that month was 165 cm. This is probably a reliable guess. By extrapolation, we predict that his height in Month #101 will be 260 cm. This guess is obviously absurd. It requires that he keep growing by 1 cm a month for the next 8-plus years.
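The predictions in the last two examples follow from simple straight-line arithmetic, sketched below in Python. This is an illustration only: it assumes each fitted line passes exactly through the first observation (115 kg in Week #1, 160 cm in Month #1) with the stated slopes, which is consistent with the numbers quoted but is not the underlying data. The helper `linear_trend` is a made-up name.

```python
def linear_trend(start_value, change_per_period):
    """Return the straight line through (1, start_value) with the given slope."""
    return lambda t: start_value + change_per_period * (t - 1)

weight = linear_trend(115, -0.5)   # kg: 115 in Week #1, about 0.5 kg lost per week
height = linear_trend(160, 1)      # cm: 160 in Month #1, about 1 cm gained per month

print(weight(6), weight(201))      # interpolation, then extrapolation
print(height(6), height(101))      # interpolation, then extrapolation
```

The straight line is equally happy to produce the sensible interpolated values and the absurd extrapolated ones; nothing in the arithmetic warns us that the trend cannot continue.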

Here are three colourful examples of the dangers of extrapolation from other contexts. Example 617. Russell’s Chicken (Problems of Philosophy, 1912, Google Books link). The man who has fed the chicken every day throughout its life at last wrings its neck instead, showing that more refined views as to the uniformity of nature would have been useful to the chicken. ... The mere fact that something has happened a certain number of times causes animals and men to expect that it will happen again. Thus our instincts certainly cause us to believe the sun will rise to-morrow, but we may be in no better a position than the chicken which unexpectedly has its neck wrung.

Example 618. The Fermat numbers are

$$F_0 = 2^{2^0} + 1 = 3, \quad F_1 = 2^{2^1} + 1 = 5, \quad F_2 = 2^{2^2} + 1 = 17, \quad F_3 = 2^{2^3} + 1 = 257, \quad F_4 = 2^{2^4} + 1 = 65537.$$

Remarkably, the first five Fermat numbers are all prime. This observation led Fermat to conjecture (guess) in the 17th century that all Fermat numbers are prime. This was an act of extrapolation. Unfortunately, Fermat’s act of extrapolation was wrong. About a century later, Euler showed that $F_5 = 2^{2^5} + 1 = 4294967297 = 641 \times 6700417$ is composite (not prime).

Today, the Fermat numbers $F_5, F_6, \ldots, F_{32}$ are all known to be composite. Indeed, it was shown in 1964 that $F_{32}$ is composite. Over half a century later, it is not yet known if $F_{33} = 2^{2^{33}} + 1$ is prime or composite. $F_{33}$ is an unimaginably huge number, with 2,585,827,973 digits.
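The claims in Example 618 that involve small numbers are easy to verify by computer. The following minimal Python sketch does so; the helper names `fermat` and `is_prime` are illustrative, and the digit count of F33 is obtained from a logarithm rather than by constructing the number itself.

```python
import math

def fermat(n):
    """The n-th Fermat number F_n = 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1

def is_prime(m):
    """Trial division; fine for F_0 to F_4, hopeless for large Fermat numbers."""
    if m < 2:
        return False
    return all(m % d for d in range(2, math.isqrt(m) + 1))

print([fermat(n) for n in range(5)])                   # the first five Fermat numbers
print(all(is_prime(fermat(n)) for n in range(5)))      # are they all prime?
print(fermat(5) == 641 * 6700417)                      # Euler's factorisation of F_5

# F_33 has the same number of digits as 2^(2^33), namely floor(2^33 * log10(2)) + 1.
digits_f33 = math.floor(2 ** 33 * math.log10(2)) + 1
print(digits_f33)
```

Trial division is used only because the first five Fermat numbers are tiny; deciding the primality of later Fermat numbers needs far more sophisticated methods.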


Example 619. On Ah Beng’s first day at school, he learns in Chinese class that the Chinese character for the number 1 is written as a single horizontal stroke. On his second day at school, he learns that the Chinese character for the number 2 is written as two horizontal strokes. On his third day at school, he learns that the Chinese character for the number 3 is written as three horizontal strokes.

一  The Chinese character for 1
二  The Chinese character for 2
三  The Chinese character for 3

After his third day at school, Ah Beng decides he’ll skip at least the next few Chinese classes, because he thinks he knows how to write the Chinese characters for the numbers 4 and above. 4 simply consists of four horizontal strokes; 5 simply consists of five horizontal strokes; etc. Unfortunately, Ah Beng’s act of extrapolation is wrong. The characters for the numbers 4 through 10 look instead like this:

四 (4), 五 (5), 六 (6), 七 (7), 八 (8), 九 (9), 十 (10)

On the other hand, here are two historical examples of extrapolation that, to everyone’s surprise, have held up remarkably well (at least to date). Example 620. Moore’s Law. In 1965, Gordon Moore observed that the number of components that could be crammed onto each integrated circuit doubled every year. He predicted that this rate of progress would continue at least through 1975.

In 1975, he adjusted his prediction to a more modest rate of doubling every two years. Thus far, this latter prediction has held up remarkably well. The following is from Nature:

[Figure: “MOORE’S LORE”. Log-scale plot of transistors per chip and clock speeds (MHz), 1960 to 2016. Caption: For the past five decades, the number of transistors per microprocessor chip (a rough measure of processing power) has doubled about every two years, in step with Moore’s law (top). Chips also increased their ‘clock speed’, or rate of executing instructions, until 2004, when speeds were capped to limit heat. As computers increase in power and shrink in size, a new class of machines has emerged roughly every ten years (bottom).]

Unfortunately, as stated in the same Nature article, it “has become increasingly obvious to everyone involved” that “Moore’s law ... is nearing its end”.

Example 621. Augustine’s Law. In 1983, Norman Augustine observed that the cost of a tactical aircraft grows four-fold every ten years. (Google Books.)

This is considerably quicker than the rate at which the annual US defense budget and US Gross National Product (GNP) grow. Extrapolating, he concluded:
• In 2054, the entire annual US defense budget will be spent on a single aircraft.
• Early in the 22nd century, the entire US GNP will be spent on a single aircraft.

These seemingly-absurd conclusions were written at least partly in jest. Except so far they have been right on track. In a 2010 Economist article, Augustine was quoted as saying, “We are right on target. Unfortunately nothing has changed.” That article also presented an updated version of Augustine’s Law.

The latest F-35 fighter program is estimated to cost the US Department of Defense US$1.124 trillion. To be fair, that estimate is the cost of the entire program over its projected 60-year lifespan (through 2070), which includes R&D, the purchase of over 2,000 F-35s, and operating costs. But still, US$1.124 trillion is a mind-blowing figure.*

*Figure quoted from an April 2016 Defense News story. Note though that the estimate keeps changing.

Exercise 275. Using the data below, “predict” how many haircuts were sold in June 2016 by (a) a barber who charged $7 per haircut; and (b) a barber who charged $200 per haircut. Which prediction is an act of interpolation and which is an act of extrapolation? Which prediction do you think is more reliable? (Answer on p. 1120.)

i          1    2     3    4    5
p_i ($)    8    9     4   10    8
q_i      300  250  1000  400  400

73.8 Transformations to Achieve Linearity

Two variables may have a relationship, but not a linear one. Here we consider cases where the relationship is quadratic, reciprocal, or logarithmic. Example 622. Quadratic. Consider the following data. There is a very strong, but not perfect degree of linear correlation between x and y (r ≈ 0.950). The observations are very close to, but are not exactly on the OLS line of best fit.

Perhaps we can do better by transforming the data. We’ll do a quadratic transformation: let $z_i = x_i^2$. Then we have the following data.

The degree of linear correlation between z and y is near perfect (r ≈ 0.995). The observations also lie closer to the line of best fit than before.

Example 623. Reciprocal. Consider the following data. There seems to be a moderate degree of linear correlation between x and y (r ≈ −0.603). The observations are fairly close to the OLS line of best fit.

Perhaps we can do better by transforming the data. We’ll do a reciprocal transformation: let $z_i = 1/x_i$. Then we have the following data and scatter diagram.

The degree of linear correlation between z and y is much stronger (r ≈ 0.899). The observations also lie closer to the line of best fit.


Example 624. Logarithmic. Consider the following data. There seems to be a fairly strong degree of linear correlation between x and y (r ≈ 0.873). The observations are fairly close to, but are not exactly on the OLS line of best fit.

Perhaps we can do better by transforming the data. We’ll do a logarithmic transformation: let $z_i = \ln x_i$. Then we have the following data and scatter diagram.

The degree of linear correlation between z and y is much stronger (r ≈ 0.978). The observations also lie closer to the line of best fit.
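The effect of the three transformations can be checked numerically. The following is a minimal Python sketch, not the calculation behind Examples 622 to 624: the data are hypothetical and constructed so that y is exactly linear in the transformed variable z, and the helper name `pmcc` is our own. It compares the PMCC before the transformation, r(x, y), with the PMCC after it, r(z, y).

```python
import math

def pmcc(xs, ys):
    """Return the product moment correlation coefficient of xs and ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data (NOT the data of Examples 622-624): y = 3 + 2 g(x) exactly.
xs = [1, 2, 3, 4, 5]
transforms = {
    "quadratic": lambda x: x ** 2,
    "reciprocal": lambda x: 1 / x,
    "logarithmic": math.log,
}

results = {}
for name, g in transforms.items():
    zs = [g(x) for x in xs]        # transformed variable z_i = g(x_i)
    ys = [3 + 2 * z for z in zs]   # y is exactly linear in z, not in x
    results[name] = (pmcc(xs, ys), pmcc(zs, ys))
    print(f"{name}: r(x, y) = {results[name][0]:.4f}, r(z, y) = {results[name][1]:.4f}")
```

Because the hypothetical y is an exact linear function of z, the PMCC after each transformation is 1 (up to rounding), while the PMCC before it is smaller in magnitude. With real data, as in the examples above, the improvement is less clean.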


Exercise 276. You are given the following data. (Answer on p. 1121.)

i    1      2      3      4      5
x_i  1      2      3      4      5
y_i  10.59  10.54  27.30  33.84  56.6

(a) Plot the above data in a scatter diagram and find the PMCC.
(b) Apply an appropriate transformation to x. Plot the transformed data in a scatter diagram and find the PMCC.


73.9 The Higher the PMCC, the Better the Model?

There are no routine statistical questions, only questionable statistical routines.
- Usually attributed to David Cox.

It’s much more interesting to live not knowing than to have answers which might be wrong.
- Richard Feynman (1981, YouTube).

The A-level examiners⁸⁵ want you to say, mindlessly and formulaically, that

All else equal, a model with a higher PMCC is better than a model with a lower PMCC.

Regurgitating the above sentence will earn you your full mark. But in fact, without the “all else equal” clause, it is nonsense. And since it is almost never true that “all else is equal”, it is almost always nonsense.

In every introductory course or text on statistics, one is told that the PMCC is merely a relatively-unimportant consideration in deciding between models. Yet somehow, the A-level examiners seem to consider the PMCC an all-important consideration. Here’s a quick example to illustrate.

Example 625. (From the 2015 exam; see Exercise 459 below.) In an experiment the following information was gathered about air pressure P, measured in inches of mercury, at different heights above sea-level h, measured in feet.

h  2000  5000  10000  15000  20000  25000  30000  35000  40000  45000
P  27.8  24.9  20.6   16.9   13.8   11.1   8.89   7.04   5.52   4.28

The exam first asks us to find the PMCCs between (a) h and P; (b) ln h and P; and (c) $\sqrt{h}$ and P. The answers are (a) $r_a \approx -0.980731$; (b) $r_b \approx -0.974800$; and (c) $r_c \approx -0.998638$.
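The three PMCCs quoted above can be reproduced from the table. The following is a minimal Python sketch; the helper name `pmcc` and the variable names are our own, and only the data values come from the example.

```python
import math

def pmcc(xs, ys):
    """Return the product moment correlation coefficient of xs and ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Data from Example 625: height h (feet) and air pressure P (inches of mercury).
h = [2000, 5000, 10000, 15000, 20000, 25000, 30000, 35000, 40000, 45000]
P = [27.8, 24.9, 20.6, 16.9, 13.8, 11.1, 8.89, 7.04, 5.52, 4.28]

r_a = pmcc(h, P)                           # (a) h and P
r_b = pmcc([math.log(v) for v in h], P)    # (b) ln h and P
r_c = pmcc([math.sqrt(v) for v in h], P)   # (c) sqrt(h) and P

print(f"r_a = {r_a:.6f}, r_b = {r_b:.6f}, r_c = {r_c:.6f}")
```

All three values are negative (pressure falls as height rises), so "largest PMCC" here can only mean largest in magnitude.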

The A-level exam then says, “Using the most appropriate case ..., find the equation which best models air pressure at different heights.” The “correct” answer is that (c) $P = a + b\sqrt{h}$ is the “most appropriate” model, simply because the PMCC there is the largest in magnitude.

(... Example continued on the next page ...)

⁸⁵ See 9740 N2015/II/10(iii), N2014/II/8(b)(ii), N2012/II/8(v), N2011/II/8(iii), N2010/II/10(iii), and N2008/II/8(i). These are given in this textbook as Exercises 459, 465, 480, 486, 496, and 508.

(... Example continued from the previous page ...) But this is utter nonsense. One does not conclude that one model is “more appropriate” than another simply because its PMCC is 0.018 larger. Small measurement errors or plain bad luck could easily explain these tiny differences in PMCCs.

Moreover, even if one model has r = 0.9 and another has r = 0.4, it does not automatically follow that the first model is “more appropriate” than the second. In deciding which statistical model to use, there are very many considerations, of which the PMCC is a relatively-unimportant one.

In my view, the correct answer should have been this: We have far too little information to make any conclusions.

Sadly, in the Singapore education system, what I consider to be the correct answer would not have gotten you any marks. Instead, one is taught that there must always be one single, simplistic, formulaic, definitive, “correct” answer. This is a convenient substitute for thinking.

As it turns out, the “most correct” linear model, based on the actual barometric formula (see subsection 85.10.1 in the Appendices), is actually the following:

$$\ln P = a + b \ln\left(1 + \frac{L}{T}h\right).$$

The constants $L = -0.0065$ kelvin per metre (K m$^{-1}$) and $T = 288.15$ kelvin (K) are, respectively, the standard temperature lapse rate (up to 11,000 m above sea level) and the standard temperature (at sea level). The PMCC for the above model is $r_d \approx 0.999998$, which is “better” than the cases examined above. (See this Google spreadsheet for the data and calculations.)
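The PMCC for the barometric model can be estimated along the same lines. The sketch below is a rough Python check, not a reproduction of the spreadsheet. It rests on one assumption that the text does not state: since L is given per metre while h is tabulated in feet, the heights are converted to metres (1 ft = 0.3048 m) before being substituted. The helper name `pmcc` is our own.

```python
import math

def pmcc(xs, ys):
    """Return the product moment correlation coefficient of xs and ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

FT_TO_M = 0.3048   # metres per foot (ASSUMPTION: heights are converted to metres)
L = -0.0065        # standard temperature lapse rate, kelvin per metre
T = 288.15         # standard temperature at sea level, kelvin

# Data from Example 625: height (feet) and air pressure (inches of mercury).
h_ft = [2000, 5000, 10000, 15000, 20000, 25000, 30000, 35000, 40000, 45000]
P = [27.8, 24.9, 20.6, 16.9, 13.8, 11.1, 8.89, 7.04, 5.52, 4.28]

# Regressor ln(1 + L h / T) with h in metres; response ln P.
x = [math.log(1 + L * (ft * FT_TO_M) / T) for ft in h_ft]
y = [math.log(p) for p in P]

r_d = pmcc(x, y)
print(f"r_d = {r_d:.6f}")
```

Under this assumption the result should be extremely close to 1, in line with the value quoted above. Without the unit conversion, the argument of the logarithm would turn negative for the largest heights, so some conversion must have been applied.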

But again, the PMCC is merely one relatively-unimportant consideration. Our conclusion that this last model is superior to the model $P = a + b\sqrt{h}$ is based not on the fact that $r_d$ is 0.001 larger in magnitude than $r_c$. Instead, we are confident in this model because it was derived from physical theories. In contrast, the model $P = a + b\sqrt{h}$ (or indeed any of the other models suggested above) is completely arbitrary and has no theoretical justification. Hence, even if the model $P = a + b\sqrt{h}$ had a PMCC of 1, we’d still prefer this last model.

Part VII

Ten-Year Series

For more practice, try the TYS questions for H1 Maths (in my H1 Maths Textbook). They’re very similar!

This part lists all the questions from 2006-2015 A-Level exams, sorted into the nine different parts and in reverse chronological order.

In the older exams, they had the habit of not distinctly numbering different parts within the same question as parts (i), (ii), etc. So I have sometimes taken the liberty of adding or modifying such numbers.

Exam Tip

Unless explicitly instructed, you are always allowed to use your graphing calculator, so use it wherever possible. Examples of explicit instructions to avoid using your calculator include (but are not limited to):
• “Without using your calculator ...”
• “Use a non-calculator method ...”
• “Find the exact value of ...”
• “Express your answer in terms of $\sqrt{3}$ or π.”

74 Past-Year Questions for Part I: Functions and Graphs

Exercise 277. (9740 N2015/I/1. Answer on p. 1122.) A curve C has equation

$$y = \frac{a}{x^2} + bx + c,$$

where a, b and c are constants. It is given that C passes through the points with coordinates (1.6, −2.4) and (−0.7, 3.6), and that the gradient of C is 2 at the point where x = 1.

(i) Find the values of a, b and c, giving your answers correct to 3 decimal places. [4]

(ii) Find the x-coordinate of the point where C crosses the x-axis, giving your answer correct to 3 decimal places. [2]

(iii) One asymptote of C is the line with equation x = 0. Write down the equation of the other asymptote of C. [1]

Exercise 278. (9740 N2015/I/2. Answer on p. 1123.) (i) Sketch the curve with equation

$$y = \left|\frac{x+1}{1-x}\right|,$$

stating the equations of the asymptotes. On the same diagram, sketch the line with equation y = x + 2. [3]

(ii) Solve the inequality

$$\left|\frac{x+1}{1-x}\right| < x + 2.$$

[3]

Exercise 279. (9740 N2015/I/5. Answer on p. 1125.) (i) State a sequence of transformations that will transform the curve with equation $y = x^2$ on to the curve with equation $y = 0.25(x-3)^2$. [2]

A curve has equation y = f(x) where

$$f(x) = \begin{cases} 1 & \text{for } 0 \le x \le 1, \\ 0.25(x-3)^2 & \text{for } 1 < x \le 3, \\ 0 & \text{otherwise.} \end{cases}$$

(ii) Sketch the curve for −1 ≤ x ≤ 4. [3]

(iii) On a separate diagram, sketch the curve with equation y = 1 + f(0.5x), for −1 ≤ x ≤ 4. [2]

Exercise 280. (9740 N2015/II/3. Answer on p. 1126.) (a) The function f is defined by

$$f : x \mapsto \frac{1}{1-x^2}, \quad x \in \mathbb{R},\ x > 1.$$

(i) Show that f has an inverse. [2]

(ii) Find $f^{-1}(x)$ and state the domain of $f^{-1}$. [3]

(b) The function g is defined by

$$g : x \mapsto \frac{2+x}{1-x^2}, \quad x \in \mathbb{R},\ x \neq \pm 1.$$

Find algebraically the range of g, giving your answer in terms of $\sqrt{3}$ as simply as possible. [5]

Exercise 281. (9740 N2014/I/1. Answer on p. 1126.) The function f is defined by

$$f : x \mapsto \frac{1}{1-x}, \quad x \in \mathbb{R},\ x \neq 1,\ x \neq 0.$$

(i) Show that $f^2(x) = f^{-1}(x)$. [4]

(ii) Find $f^3(x)$ in simplified form. [1]

Exercise 282. (9740 N2014/I/4. Answer on p. 1127.) The diagram shows the curve y = f (x). The curve crosses the x-axis at the points A, B and C, and has a maximum turning point at D where it crosses the y-axis. The coordinates of A, B, C and D are (−a, 0), (b, 0), (c, 0) and (0, d) respectively, where a, b, c and d are positive constants.

[Figure: the curve y = f(x), with horizontal intercepts A = (−a, 0), B = (b, 0) and C = (c, 0), and vertical intercept D = (0, d).]

(i) Sketch the curve $y^2 = f(x)$, stating, in terms of a, b, c and d, the coordinates of any turning points and of the points where the curve crosses the x-axis. [4]

(ii) What can be said about the tangents to the curve $y^2 = f(x)$ at the points where it crosses the x-axis? [1]

Exercise 283. (9740 N2014/II/1. Answer on p. 1128.) A curve C has parametric equations $x = 3t^2$, $y = 6t$.

(i) Find the value of t at the point on C where the tangent has gradient 0.4. [3]

(ii) The tangent at the point $P(3p^2, 6p)$ on C meets the y-axis at the point D. Find the cartesian equation of the locus of the mid-point of PD as p varies. [4]

Exercise 284. (9740 N2013/I/2. Answer on p. 1128.) It is given that

$$y = \frac{x^2+x+1}{x-1}, \quad x \in \mathbb{R},\ x \neq 1.$$

Without using a calculator, find the set of values that y can take. [5]

Exercise 285. (9740 N2013/I/3. Answer on p. 1129.) (i) Sketch the curve with equation

$$y = \frac{x+1}{2x-1},$$

stating the equations of any asymptotes and the coordinates of the points where the curve crosses the axes. [4]

(ii) Solve the inequality

$$\frac{x+1}{2x-1} < 1.$$

[1]

Exercise 286. (9740 N2013/II/1. Answer on p. 1130.) Functions f and g are defined by

$$f : x \mapsto \frac{2+x}{1-x}, \quad x \in \mathbb{R},\ x \neq 1,$$
$$g : x \mapsto 1 - 2x, \quad x \in \mathbb{R}.$$

(i) Explain why the composite function fg does not exist. [2]

(ii) Find an expression for gf(x) and hence, or otherwise, find $(gf)^{-1}(5)$. [4]

Exercise 287. (9740 N2012/I/1. Answer on p. 1130.) A cinema sells tickets at three different prices, depending on the age of the customer. The age categories are under 16 years, between 16 and 65 years, and over 65 years. Three groups of people, A, B, and C, go to the cinema on the same day. The numbers in each category for each group, together with the total cost of the tickets for each group, are given in the following table.

Group  Under 16 years  Between 16 and 65 years  Over 65 years  Total cost
A      9               6                        4              $162.03
B      7               5                        3              $128.36
C      10              4                        5              $158.50

Write down and solve equations to find the cost of a ticket for each of the age categories. [4]

Exercise 288. (9740 N2012/I/7. Answer on p. 1131.) A function f is said to be self-inverse if $f(x) = f^{-1}(x)$ for all x in the domain of f. The function g is defined by

$$g : x \mapsto \frac{x+k}{x-1}, \quad x \in \mathbb{R},\ x \neq 1,$$

where k is a constant, k ≠ −1.

(i) Show that g is self-inverse. [2]

(ii) Given that k > 0, sketch the curve y = g(x), stating the equations of any asymptotes and the coordinates of any points where the curve crosses the x- and y-axes. [3]

(iii) State the equation of one line of symmetry of the curve in part (ii), and describe fully a sequence of transformations which would transform the curve $y = \frac{1}{x}$ onto this curve. [4]

Exercise 289. (9740 N2013/I/3. Answer on p. 1133.) It is given that $f(x) = x^3 + x^2 - 2x - 4$.

(i) Sketch the graph of y = f(x). [1]

(ii) Find the integer solution of the equation f(x) = 4, and prove algebraically that there are no other real solutions. [3]

(iii) State the integer solution of the equation $(x+3)^3 + (x+3)^2 - 2(x+3) - 4 = 4$. [1]

(iv) Sketch the graph of y = |f(x)|. [1]

(v) Write down two different cubic equations which between them give the roots of the equation |f(x)| = 4. Hence find all the roots of this equation. [4]

Exercise 290. (9740 N2011/I/1. Answer on p. 1134.) Without using a calculator, solve the inequality

$$\frac{x^2+x+1}{x^2+x-2} < 0.$$

[4]

Exercise 291. (9740 N2011/I/2. Answer on p. 1134.) It is given that $f(x) = ax^2 + bx + c$, where a, b, and c are constants.

(i) Given that the curve with equation y = f(x) passes through the points with coordinates (−1.5, 4.5), (2.1, 3.2) and (3.4, 4.1), find the values of a, b, and c. Give your answers correct to 3 decimal places. [3]

(ii) Find the set of values of x for which f(x) is an increasing function.

Exercise 292. (9740 N2011/II/3. Answer on p. 1135.) The function f is defined by

$$f : x \mapsto \ln(2x+1) + 3, \quad x \in \mathbb{R},\ x > -\frac{1}{2}.$$

(i) Find $f^{-1}(x)$ and write down the domain and range of $f^{-1}$. [4]

(ii) Sketch on the same diagram the graphs of y = f(x) and $y = f^{-1}(x)$, giving the equations of any asymptotes and the exact coordinates of any points where the curves cross the x- and y-axes. [4]

(iii) Explain why the x-coordinates of the points of intersection of the curves in part (ii) satisfy the equation ln(2x + 1) = x − 3, and find the values of these x-coordinates, correct to 4 significant figures. [3]

Exercise 293. (9740 N2010/I/5. Answer on p. 1136.) The curve with equation $y = x^3$ is transformed by a translation of 2 units in the positive x-direction, followed by a stretch with scale factor 0.5 parallel to the y-axis, followed by a translation of 6 units in the negative y-direction.

(i) Find the equation of the new curve in the form y = f(x) and the exact coordinates of the points where this curve crosses the x- and y-axes. Sketch the new curve. [5]

(ii) On the same diagram, sketch the graph of $y = f^{-1}(x)$, stating the exact coordinates of the points where the graph crosses the x- and y-axes. [3]

Exercise 294. (9740 N2010/II/4. Answer on p. 1137.) The function f is defined as follows.

$$f : x \mapsto \frac{1}{x^2-1}, \quad \text{for } x \in \mathbb{R},\ x \neq -1,\ x \neq 1.$$

(i) Sketch the graph of y = f(x). [1]

(ii) If the domain of f is further restricted to x ≥ k, state with a reason the least value of k for which the function $f^{-1}$ exists. [2]

In the rest of the question, the domain of f is x ∈ R, x ≠ −1, x ≠ 1, as originally defined. The function g is defined as follows.

$$g : x \mapsto \frac{1}{x-3}, \quad \text{for } x \in \mathbb{R},\ x \neq 2,\ x \neq 3,\ x \neq 4.$$

(iii) Show that

$$fg(x) = \frac{(x-3)^2}{(4-x)(x-2)}.$$

[2]

(iv) Solve the inequality fg(x) > 0. [3]

(v) Find the range of fg. [3]

Exercise 295. (9740 N2009/I/1. Answer on p. 1139.) (i) The first three terms of a sequence are given by $u_1 = 10$, $u_2 = 6$, $u_3 = 5$. Given that $u_n$ is a quadratic polynomial in n, find $u_n$ in terms of n. [4]

(ii) Find the set of values of n for which $u_n$ is greater than 100. [2]

Exercise 296. (9740 N2009/I/6. Answer on p. 1140.) The curve $C_1$ has equation $y = \frac{x-2}{x+2}$. The curve $C_2$ has equation $\frac{x^2}{6} + \frac{y^2}{3} = 1$.

(i) Sketch $C_1$ and $C_2$ on the same diagram, stating the exact coordinates of any points of intersection with the axes and the equations of any asymptotes. [4]

(ii) Show algebraically that the x-coordinates of the points of intersection of $C_1$ and $C_2$ satisfy the equation $2(x-2)^2 = (x+2)^2(6-x^2)$. [2]

(iii) Use your calculator to find these x-coordinates. [2]

Exercise 297. (9740 N2009/II/3. Answer on p. 1141.) The function f is defined by

$$f : x \mapsto \frac{ax}{bx-a}, \quad \text{for } x \in \mathbb{R},\ x \neq \frac{a}{b},$$

where a and b are non-zero constants.

(i) Find $f^{-1}(x)$. Hence or otherwise find $f^2(x)$ and state the range of $f^2$. [5]

(ii) The function g is defined by $g : x \mapsto \frac{1}{x}$ for all real non-zero x. State whether the composite function fg exists, justifying your answer. [2]

(iii) Solve the equation $f^{-1}(x) = x$.

Exercise 298. (9740 N2008/I/9. Answer on p. 1142.) It is given that

$$f(x) = \frac{ax+b}{cx+d},$$

for non-zero constants a, b, c, and d.

(i) Given that ad − bc ≠ 0, show by differentiation that the graph of y = f(x) has no turning points. [3]

(ii) What can be said about the graph of y = f(x) when ad − bc = 0? [2]

(iii) Deduce from part (i) that the graph of

$$y = \frac{3x-7}{2x+1}$$

has a positive gradient at all points of the graph. [1]

(iv) On separate diagrams, draw sketches of the graphs of

(a) $y = \frac{3x-7}{2x+1}$,
(b) $y^2 = \frac{3x-7}{2x+1}$,

including the coordinates of the points where the graphs cross the axes and the equations of any asymptotes. [5]

Exercise 299. (9233 N2008/I/14. Answer on p. 1143.) Sketch, on separate diagrams, the curves

(i) $y = \frac{x}{x^2-1}$, stating the equations of the asymptotes, [4]
(ii) $y^2 = \frac{x}{x^2-1}$, making clear the form of the curve at the origin. [3]

Show that the x-coordinates of the points of intersection of the curves $y = \frac{x}{x^2-1}$ and $y = e^x$ satisfy the equation $x^2 = 1 + xe^{-x}$.

Use the iterative formula $x_{n+1} = \sqrt{1 + x_n e^{-x_n}}$, together with a suitable initial value $x_1$, to find the positive root of this equation correct to 2 decimal places.

Exercise 300. (9740 N2008/II/4. Answer on p. 1144.) The function f is defined by $f : x \mapsto (x-4)^2 + 1$ for x ∈ R, x > 4.

(i) Sketch the graph of y = f(x). Your sketch should indicate the position of the graph in relation to the origin. [2]

(ii) Find $f^{-1}(x)$, stating the domain of $f^{-1}$. [3]

(iii) On the same diagram as in part (i), sketch the graph of $y = f^{-1}(x)$. [1]

(iv) Write down the equation of the line in which the graph of y = f(x) must be reflected in order to obtain the graph of $y = f^{-1}(x)$, and hence find the exact solution of the equation $f(x) = f^{-1}(x)$. [5]

Exercise 301. (9740 N2007/I/1. Answer on p. 1145.) Show that

$$\frac{2x^2-x-19}{x^2+3x+2} - 1 = \frac{x^2-4x-21}{x^2+3x+2}.$$

[1]

Hence, without using a calculator, solve the inequality

$$\frac{2x^2-x-19}{x^2+3x+2} > 1.$$

[4]

Exercise 302. (9740 N2007/I/2. Answer on p. 1145.) Functions f and g are defined by

$$f : x \mapsto \frac{1}{x-3} \quad \text{for } x \in \mathbb{R},\ x \neq 3,$$
$$g : x \mapsto x^2 \quad \text{for } x \in \mathbb{R}.$$

(i) Only one of the composite functions fg and gf exists. Give a definition (including the domain) of the composite that exists, and explain why the other composite does not exist. [3]

(ii) Find $f^{-1}(x)$ and state the domain of $f^{-1}$. [3]

Exercise 303. (9740 N2007/I/5. Answer on p. 1146.) Show that the equation $y = \frac{2x+7}{x+2}$ can be written as $y = A + \frac{B}{x+2}$, where A and B are constants to be found. Hence state a sequence of transformations which transform the graph of $y = \frac{1}{x}$ to the graph of $y = \frac{2x+7}{x+2}$. [4]

Sketch the graph of $y = \frac{2x+7}{x+2}$, giving the equations of any asymptotes and the coordinates of any points of intersection with the x- and y-axes.

Exercise 304. (9740 N2007/II/1. Answer on p. 1147.) Four friends buy three different kinds of fruits in the market. When they get home they cannot remember the individual prices per kilogram, but three of them can remember the total amount that they each paid. The weights of fruit and the total amounts paid are shown in the following table.

                        Suresh  Fandi  Cindy  Lee Lian
Pineapples (kg)         1.15    1.20   2.15   1.30
Mangoes (kg)            0.60    0.45   0.90   0.25
Lychees (kg)            0.55    0.30   0.65   0.50
Total amount paid in $  8.28    6.84   13.05

Assuming that, for each variety of fruit, the price per kilogram paid by each of the friends is the same, calculate the total amount that Lee Lian paid. [6]

Exercise 305. (9233 N2007/II/4. Answer on p. 1147.) The function f is defined by

$$f : x \mapsto \frac{4x+1}{x-3}, \quad x \in \mathbb{R},\ x \neq 3.$$

(i) State the equations of the two asymptotes of the graph of y = f(x). [2]

(ii) Sketch the graph of y = f(x), showing its asymptotes and stating the coordinates of the points of intersection with the axes. [3]

(iii) Find an expression for $f^{-1}(x)$ and state the domain of $f^{-1}$. [3]

Exercise 306. (9233 N2006/I/3. Answer on p. 1148.) Functions f and g are defined by

$$f : x \mapsto 5x + 3, \quad x > 0,$$
$$g : x \mapsto \frac{3}{x}, \quad x > 0.$$

Find, in a similar form, fg, $g^2$ and $g^{35}$. [3] [Note: $g^2$ denotes gg.]

Express h in terms of one or both f and g, where

$$h : x \mapsto 25x + 18, \quad x > 0.$$

[1]

Exercise 307. (9233 N2006/II/1. Answer on p. 1149.) Solve the inequality

$$\frac{x-9}{x^2-9} \leq 1.$$

[5]

Remark: This was the old 9233 exam. Technically, inequalities of the form $\frac{N}{D} \geq 0$ are no longer in the 9740 or 9758 syllabus. Only inequalities of the form $\frac{N}{D} > 0$ are. But this is not a difficult question and you should go ahead and try it.

75 Past-Year Questions for Part II: Sequences and Series

Exercise 308. (9740 N2015/I/8. Answer on p. 1150.) Two athletes are to run 20 km by running 50 laps around a circular track of length 400 m. They aim to complete the distance in between 1.5 hours and 1.75 hours inclusive. (i) Athlete A runs the first lap in T seconds and each subsequent lap takes 2 seconds longer than the previous lap. Find the set of values of T which will enable A to complete the distance within the required time interval. [4] (ii) Athlete B runs the first lap in t seconds and the time for each subsequent lap is 2% more than the time for the previous lap. Find the set of values of t which will enable B to complete the distance within the required time interval. [4] (iii) Assuming each athlete completes the 20 km run in exactly 1.5 hours, find the difference in the athletes’ times for their 50th laps, giving your answer to the nearest second. [3]

Exercise 309. (9740 N2015/II/4. Answer on p. 1151.) Skip part (a) if you’re taking the 9758 (revised) exam.

(a) Prove by the method of mathematical induction that

$$1 \times 3 \times 6 + 2 \times 4 \times 7 + 3 \times 5 \times 8 + \cdots + n(n+2)(n+5) = \frac{1}{12}n(n+1)(3n^2+31n+74).$$

[6]

(b) (i) Show that $\frac{2}{4r^2+8r+3}$ can be expressed as $\frac{A}{2r+1} + \frac{B}{2r+3}$, where A and B are constants to be determined. [1]

The sum $\sum_{r=1}^{n} \frac{2}{4r^2+8r+3}$ is denoted by $S_n$.

(ii) Find an expression for $S_n$ in terms of n.

(iii) Find the smallest value of n for which $S_n$ is within $10^{-3}$ of the sum to infinity. [3]

Exercise 310. (9740 N2014/I/6. Answer on p. 1153.) Skip part (a) if you’re taking the 9758 (revised) exam.

(a) A sequence $p_1, p_2, p_3, \ldots$ is given by $p_1 = 1$ and $p_{n+1} = 4p_n - 7$ for n ≥ 1.

(i) Use the method of mathematical induction to prove that $p_n = \frac{7-4^n}{3}$. [5]

(ii) Find $\sum_{r=1}^{n} p_r$. [3]

(b) The sum $S_n$ of the first n terms of a sequence $u_1, u_2, u_3, \ldots$ is given by

$$S_n = 1 - \frac{1}{(n+1)!}.$$

(i) Give a reason why the series $\sum u_r$ converges, and write down the value of the sum to infinity. [2]

(ii) Find a formula for $u_n$ in simplified form. [2]

Exercise 311. (9740 N2014/II/3. Answer on p. 1154.) In a training exercise, athletes run from a starting point O to and from a series of points A1 , A2 , A3 , . . . , increasingly far away in a straight line. In the exercise, athletes start at O and run stage 1 from O to A1 and back to O, then stage 2 from O to A2 and back to O, and so on.

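[Figure, top: Version 1 of the exercise. Points O, A1, A2, ..., A8 lie on a straight line, with adjacent points 4 m apart.]

[Figure, bottom: Version 2 of the exercise. Points O, A1, A2, A3, A4 lie on a straight line, with OA1 = 4 m, A1A2 = 4 m, A2A3 = 8 m and A3A4 = 16 m.]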

(i) In Version 1 of the exercise (top), the distances between adjacent points are all 4 m.

(a) Find the distance run by an athlete who completes the first 10 stages of Version 1 of the exercise. [2]

(b) Write down an expression for the distance run by an athlete who completes n stages of Version 1. Hence find the least number of stages that the athlete needs to complete to run at least 5 km. [4]

(ii) In Version 2 of the exercise (bottom), the distances between the points are such that $OA_1 = 4$ m, $A_1A_2 = 4$ m, $A_2A_3 = 8$ m and $A_nA_{n+1} = 2A_{n-1}A_n$. Write down an expression for the distance run by an athlete who completes n stages of Version 2. Hence find the distance from O, and the direction of travel, of the athlete after he has run exactly 10 km using Version 2. [5]

Exercise 312. (9740 N2013/I/7. Answer on p. 1155.) A gardener is cutting off pieces of string from a long roll of string. The first piece he cuts off is 128 cm long and each successive piece is 2/3 as long as the preceding piece. (i) The length of the nth piece of string cut off is p cm. Show that ln p = (An + B) ln 2 + (Cn + D) ln 3, for constants A, B, C and D to be determined. [3] (ii) Show that the total length of string cut off can never be greater than 384 cm. [2]

(iii) How many pieces must be cut off before the total length cut off is greater than 380 cm? You must show sufficient working to justify your answer. [4]

Exercise 313. (9740 N2013/I/9. Answer on p. 1156.) Skip this question if you’re taking the 9758 (revised) exam.

(i) Prove by the method of mathematical induction that

$$\sum_{r=1}^{n} r(2r^2+1) = \frac{1}{2}n(n+1)(n^2+n+1).$$

[5]

(ii) It is given that $f(r) = 2r^3 + 3r^2 + r + 24$. Show that $f(r) - f(r-1) = ar^2$, for a constant a to be determined. Hence find a formula for $\sum_{r=1}^{n} r^2$, fully factorizing your answer. [5]

(iii) Find $\sum_{r=1}^{n} f(r)$. (You should not simplify your answer.) [3]

Exercise 314. (9740 N2012/I/3. Answer on p. 1158.) Skip this question if you’re taking the 9758 (revised) exam. A sequence $u_1, u_2, u_3, \ldots$ is given by

$$u_1 = 2 \quad \text{and} \quad u_{n+1} = \frac{3u_n - 1}{6} \quad \text{for } n \geq 1.$$

(i) Find the exact values of $u_2$ and $u_3$. [2]

(ii) It is given that $u_n \to l$ as $n \to \infty$. Showing your working, find the exact value of l. [2]

(iii) For this value of l, use the method of mathematical induction to prove that

$$u_n = \frac{14}{3}\left(\frac{1}{2}\right)^n + l.$$

[4]

Exercise 315. (9740 N2012/II/4. Answer on p. 1159.) On 1 January 2001 Mrs A put $100 into a bank account, and on the first day of each subsequent month she put in $10 more than in the previous month. Thus on 1 February she put $110 into the account and on 1 March she put $120 into the account, and so on. The account pays no interest. (i) On what date did the value of Mrs A’s account first become greater than $5000? [5] On 1 January 2001 Mr B put $100 into a savings account, and on the first day of each subsequent month he put another $100 into the account. The interest rate was 0.5% per month, so that on the last day of each month the amount in the account on that day was increased by 0.5%. (ii) Use the formula for the sum of a geometric progression to find an expression for the value of Mr B’s account on the last day of the nth month (where January 2001 was the 1st month, February 2001 was the 2nd month, and so on). Hence find in which month the value of Mr B’s account first became greater than $5000. [5] (iii) Mr B wanted the value of his account to be $5000 on 2 December 2003. What interest rate per month, applied from January 2001, would achieve this? [3]

Exercise 316. (9740 N2011/I/6. Answer on p. 1160.) Skip parts (ii) and (iii) of this question if you’re taking the 9758 (revised) exam.

(i) Using the formulae for sin(A ± B), prove that

$$\sin(r+0.5)\theta - \sin(r-0.5)\theta = 2\cos(r\theta)\sin(0.5\theta).$$

[2]

(ii) Hence find a formula for $\sum_{r=1}^{n} \cos(r\theta)$ in terms of $\sin(n+0.5)\theta$ and $\sin(0.5\theta)$. [3]

(iii) Prove by the method of mathematical induction that

$$\sum_{r=1}^{n} \sin(r\theta) = \frac{\cos(0.5\theta) - \cos(n+0.5)\theta}{2\sin(0.5\theta)}$$

for all positive integers n. [6]

Exercise 317. (9740 N2011/I/9. Answer on p. 1162.) (i) A company is drilling for oil. Using machine A, the depth drilled on the first day is 256 metres. On each subsequent day, the depth drilled is 7 metres less than on the previous day. Drilling continues daily up to and including the day when a depth of less than 10 metres is drilled. What depth is drilled on the 10th day, and what is the total depth when drilling is completed? [6]

(ii) Using machine B, the depth drilled on the first day is also 256 metres. On each subsequent day, the depth drilled is $\frac{8}{9}$ of the depth drilled on the previous day. How many days does it take for the depth drilled to exceed 99% of the theoretical maximum total depth? [4]

Exercise 318. (9740 N2010/I/3. Answer on p. 1162.) Skip part (ii) of this question if you’re taking the 9758 (revised) exam. The sum $S_n$ of the first n terms of a sequence $u_1, u_2, u_3, \ldots$ is given by $S_n = n(2n + c)$, where c is a constant.

(i) Find $u_n$ in terms of c and n. [3]

(ii) Find a recurrence relation of the form $u_{n+1} = f(u_n)$. [2]

Exercise 319. (9740 N2010/II/2. Answer on p. 1163.) Skip part (i) of this question if you’re taking the 9758 (revised) exam.

(i) Prove by mathematical induction that

$$\sum_{r=1}^{n} r(r+2) = \frac{1}{6}n(n+1)(2n+7).$$

[5]

(ii) Prove by the method of differences that

$$\sum_{r=1}^{n} \frac{1}{r(r+2)} = \frac{3}{4} - \frac{1}{2(n+1)} - \frac{1}{2(n+2)}.$$

[4]

Exercise 320. (9740 N2009/I/3. Answer on p. 1165.) (i) Show that

$$\frac{1}{n-1} - \frac{2}{n} + \frac{1}{n+1} = \frac{A}{n^3-n},$$

where A is a constant to be found. [2]

(ii) Hence find

$$\sum_{r=2}^{n} \frac{1}{r^3-r}.$$

(There is no need to express your answer as a single algebraic fraction.) [3]

(iii) Give a reason why the series $\sum_{r=2}^{\infty} \frac{1}{r^3-r}$ converges, and write down its value. [2]

Exercise 321. (9740 N2009/I/5. Answer on p. 1166.) Skip (i) if you’re taking the 9758 (revised) exam.

(i) Use the method of mathematical induction to prove that

$$\sum_{r=1}^{n} r^2 = \frac{1}{6}n(n+1)(2n+1).$$

[4]

(ii) Find $\sum_{r=n+1}^{2n} r^2$, giving your answer in fully factorized form. [4]

Exercise 322. (9740 N2009/I/8. Answer on p. 1167.) Two musical instruments, A and B, consist of metal bars of decreasing lengths.

(i) The first bar of instrument A has length 20 cm and the lengths of the bars form a geometric progression. The 25th bar has length 5 cm. Show that the total length of all the bars must be less than 357 cm, no matter how many bars there are. [4]

Instrument B consists of only 25 bars which are identical to the first 25 bars of instrument A.

(ii) Find the total length, L cm, of all the bars of instrument B and the length of the 13th bar. [3]

(iii) Unfortunately the manufacturer misunderstands the instructions and constructs instrument B wrongly, so that the lengths of the bars are in arithmetic progression with common difference d cm. If the total length of the 25 bars is still L cm and the length of the 25th bar is still 5 cm, find the value of d and the length of the longest bar. [4]

Exercise 323. (9740 N2008/I/2. Answer on p. 1168.) Skip this question if you’re taking the 9758 (revised) exam. The nth term of a sequence is given by $u_n = n(2n+1)$, for n ≥ 1. The sum of the first n terms is denoted by $S_n$. Use the method of mathematical induction to show that $S_n = \frac{1}{6}n(n+1)(4n+5)$ for all positive integers n. [5]

Exercise 324. (9740 N2008/I/10. Answer on p. 1169.) (i) A student saves $10 on 1 January 2009. On the first day of each subsequent month she saves $3 more than in the previous month, so that she saves $13 on 1 February 2009, $16 on 1 March 2009, and so on. On what date will she first have saved over $2000 in total?

(ii) A second student puts $10 on 1 January 2009 into a bank account which pays compound interest at a rate of 2% per month on the last day of each month. She puts a further $10 into the account on the first day of each subsequent month.

(a) How much compound interest has her original $10 earned at the end of 2 years? [2]
(b) How much in total is in the account at the end of 2 years? [3]
(c) After how many complete months will the total in the account first exceed $2000? [4]

Exercise 325. (9233 N2008/I/14. Answer on p. 1170.) Skip part (i) of this question if you’re taking the 9758 (revised) exam. It is required to prove the statement:

$1 + 2x + 3x^2 + \cdots + nx^{n-1} = \frac{1 - (n+1)x^n + nx^{n+1}}{(1-x)^2}$.

(i) Use mathematical induction to prove the statement for all positive integers n.
(ii) By considering the expression obtained by integrating each term on the left hand side, prove the statement without using mathematical induction. [6]

Exercise 326. (9233 N2008/II/2. Answer on p. 1172.) An arithmetic progression and a geometric progression each have first term $\frac{1}{2}$. The sum of their second terms is $\frac{1}{2}$ and the sum of their third terms is $\frac{1}{8}$. Given that the geometric progression is convergent, find its sum to infinity. [6]


Exercise 327. (9740 N2007/I/9. Answer on p. 1173.) Technically this question involves recurrence relations, so you can skip it if you’re taking the 9758 (revised) exam. But it’s perfectly doable even using only what you learnt in 9758.

The diagram shows the graph of $y = e^x - 3x$. The two roots of the equation $e^x - 3x = 0$ are denoted by α and β, where α < β.
(i) Find the values of α and β, each correct to 3 decimal places. [2]

A sequence of real numbers $x_1, x_2, x_3, \ldots$ satisfies the recurrence relation

$x_{n+1} = \frac{1}{3}e^{x_n}$, for n ≥ 1.

(ii) Prove algebraically that, if the sequence converges, then it converges to either α or β. [2]
(iii) Use a calculator to determine the behaviour of the sequence for each of the cases $x_1 = 0$, $x_1 = 1$, $x_1 = 2$. [3]
(iv) By considering $x_{n+1} - x_n$, prove that
$x_{n+1} < x_n$ if $\alpha < x_n < \beta$,
$x_{n+1} > x_n$ if $x_n < \alpha$ or $x_n > \beta$. [2]

(v) State briefly how the results in part (iv) relate to the behaviours determined in (iii).
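Part (iii) of Exercise 327 asks for calculator work; the same iteration can be sketched in a few lines of code. The function name, the number of steps and the cut-off used to flag divergence are all arbitrary choices made for this illustration.

```python
import math

def iterate(x1, steps=60, cap=50.0):
    """Apply x_{n+1} = exp(x_n) / 3 repeatedly, starting from x1.

    Returns the last iterate, or math.inf as soon as a term exceeds
    `cap` (an arbitrary threshold used here to signal divergence)."""
    x = x1
    for _ in range(steps):
        x = math.exp(x) / 3.0
        if x > cap:
            return math.inf
    return x

for start in (0.0, 1.0, 2.0):
    print(start, iterate(start))
```

Comparing the printed values with the roots found in part (i) is a useful cross-check before attempting parts (iv) and (v).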


Exercise 328. (9740 N2007/I/10. Answer on p. 1174.) A geometric series has common ratio r, and an arithmetic series has first term a and common difference d, where a and d are non-zero. The first three terms of the geometric series are equal to the first, fourth and sixth terms respectively of the arithmetic series.
(i) Show that $3r^2 - 5r + 2 = 0$. [4]

(ii) Deduce that the geometric series is convergent and find, in terms of a, the sum to infinity. [5]
(iii) The sum of the first n terms of the arithmetic series is denoted by S. Given that a > 0, find the set of possible values of n for which S exceeds 4a. [5]

Exercise 329. (9740 N2007/II/2. Answer on p. 1175.) Skip part (i) of this question if you’re taking the 9758 (revised) exam. A sequence $u_1, u_2, u_3, \ldots$ is such that $u_1 = 1$ and

$u_{n+1} = u_n - \frac{2n+1}{n^2(n+1)^2}$, for all n ≥ 1.

(i) Use the method of mathematical induction to prove that $u_n = \frac{1}{n^2}$. [4]

(ii) Hence find $\sum_{n=1}^{N} \frac{2n+1}{n^2(n+1)^2}$. [2]

(iii) Give a reason why the series in part (ii) is convergent and state the sum to infinity. [2]
(iv) Use your answer to part (ii) to find $\sum_{n=2}^{N} \frac{2n-1}{n^2(n-1)^2}$. [2]

Exercise 330. (9233 N2007/I/14. Answer on p. 1176.) Skip this question if you’re taking the 9758 (revised) exam. Use the method of mathematical induction to prove the following result.

$\sum_{r=1}^{n} \sin rx = \frac{\cos(0.5x) - \cos[(n+0.5)x]}{2\sin(0.5x)}$, where 0 < x < 2π. [6]

Exercise 331. (9233 N2007/II/1. Answer on p. 1177.) Find $\sum_{r=1}^{2n} 3^{r+2}$. [3]

Exercise 332. (9233 N2006/I/1. Answer on p. 1177.) The sum $S_n$ of the first n terms of a geometric progression is given by $S_n = 6 - \frac{2}{3^{n-1}}$. Find the first term and the common ratio.

Exercise 333. (9233 N2006/I/11. Answer on p. 1178.) Skip (i) if you’re taking the 9758 (revised) exam.
(i) Prove by induction that $\sum_{r=1}^{n} r^3 = \frac{1}{4}n^2(n+1)^2$. [4]
(ii) Deduce that $2^3 + 4^3 + 6^3 + \cdots + (2n)^3 = 2n^2(n+1)^2$. [1]
Hence or otherwise find $\sum_{r=1}^{n} (2r-1)^3$, simplifying your answer. [4]


76 Past-Year Questions for Part III: Vectors

Exercise 334. (9740 N2015/I/7. Answer on p. 1179.) Referred to the origin O, points A and B have position vectors a and b respectively. Point C lies on OA, between O and A, such that OC ∶ CA = 3 ∶ 2. Point D lies on OB, between O and B, such that OD ∶ DB = 5 ∶ 6.
(i) Find the position vectors $\overrightarrow{OC}$ and $\overrightarrow{OD}$, giving your answers in terms of a and b. [2]

(ii) Show that the vector equation of the line BC can be written as r = 0.6λa + (1 − λ)b, where λ is a parameter. Find in a similar form the vector equation of the line AD in terms of a parameter µ. [3] (iii) Find, in terms of a and b, the position vector of the point E where the lines BC and AD meet and find the ratio AE ∶ ED. [5] Exercise 335. (9740 N2015/II/2. Answer on p. 1180.) The line L has equation r = i − 2j − 4k + λ(2i + 3j − 6k). (i) Find the acute angle between L and the x-axis. [2] The point P has position vector 2i + 5j − 6k.

(ii) Find the points on L which are a distance of $\sqrt{33}$ from P. Hence or otherwise find the point on L which is closest to P. [5]

(iii) Find a cartesian equation of the plane that includes the line L and the point P . [3] Exercise 336. (9740 N2014/I/3. Answer on p. 1181.) (i) Given that a × b = 0, what can be deduced about the vectors a and b? [2] (ii) Find a unit vector n such that n × (i + 2j − 2k) = 0. [2]

(iii) Find the cosine of the acute angle between i + 2j − 2k and the z-axis. [1]


Exercise 337. (9740 N2014/I/9. Answer on p. 1181.) Planes p and q are perpendicular. Plane p has equation x + 2y − 3z = 12. Plane q contains the line l with equation

$\frac{x-1}{2} = \frac{y+1}{-1} = \frac{z-3}{4}$.

The point A on l has coordinates (1, −1, 3).

(i) Find a cartesian equation of q. [4]

(ii) Find a vector equation of the line m where p and q meet. [4] (iii) B is a general point on m. Find an expression for the square of the distance AB. Hence, or otherwise, find the coordinates of the point on m which is nearest to A. [5] Exercise 338. (9740 N2013/I/1. Answer on p. 1181.) Skip this question if you’re taking the 9758 (revised) exam. Planes p, q and r have equations x − 2z = 4, 2x − 2y + z = 6 and 5x − 4y + µz = −9 respectively, where µ is a constant.

(i) Given that µ = 3, find the coordinates of the point of intersection of p, q and r. (ii) Given instead that µ = 0, describe the relationship between p, q and r.


Exercise 339. (9740 N2013/I/6. Answer on p. 1182.) Skip part (iii) of this question if you’re taking the 9758 (revised) exam.

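[Diagram: the origin O and the points A, B and C, with position vectors a, b and c; N lies on AC and M lies on OB.]

The origin O and the points A, B and C lie in the same plane, where $\overrightarrow{OA} = a$, $\overrightarrow{OB} = b$ and $\overrightarrow{OC} = c$ (see diagram).
(i) Explain why c can be expressed as c = λa + µb, for constants λ and µ. [1]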

The point N is on AC such that AN ∶ N C = 3 ∶ 4.

(ii) Write down the position vector of N in terms of a and c. [1] (iii) It is given that the area of triangle ON C is equal to the area of triangle OM C, where M is the mid-point of OB. By finding the areas of these triangles in terms of a and b, find λ in terms of µ in the case where λ and µ are both positive. [5]


Exercise 340. (9740 N2013/II/4. Answer on p. 1183.) The planes p1 and p2 have equations r ⋅ (2, −2, 1) = 1 and r ⋅ (−6, 3, 2) = −1 respectively, and meet in the line l.

(i) Find the acute angle between p1 and p2 . [3] (ii) Find a vector equation for l. [4]

(iii) The point A(4, 3, c) is equidistant from the planes p1 and p2 . Calculate the two possible values of c. [6]

Exercise 341. (9740 N2012/I/5. Answer on p. 1184.) Skip part (i) of this question if you’re taking the 9758 (revised) exam. Referred to the origin O, the points A and B have position vectors a and b such that a = i − j + k and b = i + 2j. The point C has position vector c given by c = λa + µb, where λ and µ are positive constants.
(i) Given that the area of triangle OAC is $\sqrt{126}$, find µ. [4]
(ii) Given instead that µ = 4 and that $OC = 5\sqrt{3}$, find the possible coordinates of C. [4]

Exercise 342. (9740 N2012/I/9. Answer on p. 1184.)
(i) Find a vector equation of the line through the points A and B with position vectors 7i + 8j + 9k and −i − 8j + k respectively.

(ii) The perpendicular to this line from the point C with position vector i + 8j + 3k meets the line at the point N . Find the position vector of N and the ratio AN ∶ N B. [5]


Exercise 343. (9740 N2011/I/7. Answer on p. 1185.) Skip part (i) of this question if you’re taking the 9758 (revised) exam.

[Diagram: triangle OAB with P on OA, Q on OB, and M the mid-point of PQ.]

Referred to the origin O, the points A and B are such that $\overrightarrow{OA} = a$ and $\overrightarrow{OB} = b$. The point P on OA is such that OP ∶ PA = 1 ∶ 2, and the point Q on OB is such that OQ ∶ QB = 3 ∶ 2. The mid-point of PQ is M (see diagram).

(i) Find $\overrightarrow{OM}$ in terms of a and b and show that the area of triangle OMP can be written as k ∣a × b∣, where k is a constant to be found. [6]
(ii) The vectors a and b are now given by a = 2pi − 6pj + 3pk and b = i + j − 2k, where p is a positive constant. Given that a is a unit vector,
(a) find the exact value of p, [2]

(b) give a geometrical interpretation of ∣a ⋅ b∣, [1]

(c) evaluate a × b. [2]


Exercise 344. (9740 N2011/I/11. Answer on p. 1186.) The plane p passes through the points with coordinates (4, −1, −3), (−2, −5, 2) and (4, −3, −2). (i) Find a cartesian equation of p. [4]

The lines $l_1$ and $l_2$ have, respectively, equations

$\frac{x-1}{2} = \frac{y-2}{-4} = \frac{z+3}{1}$ and $\frac{x+2}{1} = \frac{y-1}{5} = \frac{z-3}{k}$,

where k is a constant. It is given that l1 and l2 intersect. (ii) Find the value of k. [4]

(iii) Show that l1 lies in p and find the coordinates of the point at which l2 intersects p. [4] (iv) Find the acute angle between l2 and p. [3]
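For Exercise 344, the plane through three points and the claim in part (iii) that the first line lies in p can be checked with a cross product. The helper functions below are written out by hand for illustration; they are a sketch, not the working the question expects.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

# The three points on p given in the question.
p1, p2, p3 = (4, -1, -3), (-2, -5, 2), (4, -3, -2)
normal = cross(sub(p2, p1), sub(p3, p1))
constant = dot(normal, p1)          # plane: normal . r = constant

# The first line passes through (1, 2, -3) with direction (2, -4, 1).
point_l1, dir_l1 = (1, 2, -3), (2, -4, 1)
on_plane = dot(normal, point_l1) == constant
parallel = dot(normal, dir_l1) == 0
print(normal, constant, on_plane, parallel)
```

A line lies in a plane exactly when one of its points is on the plane and its direction is perpendicular to the normal, which is what the two boolean checks test.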

Exercise 345. (9740 N2010/I/1. Answer on p. 1187.) The position vectors a and b are given by a = 2pi + 3pj + 6pk and b = i − 2j + 2k, where p > 0. It is given that ∣a∣ = ∣b∣.
(i) Find the exact value of p. [2]

(ii) Show that (a + b) ⋅ (a − b) = 0. [3]

Exercise 346. (9740 N2010/I/10. Answer on p. 1187.) Skip part (iv) of this question if you’re taking the 9758 (revised) exam. The line l and plane p have, respectively, equations

$\frac{x-10}{-3} = \frac{y+1}{6} = \frac{z+3}{9}$ and x − 2y − 3z = 0.

(i) Show that l is perpendicular to p. [2]

(ii) Find the coordinates of the point of intersection of l and p. [4] (iii) Show that the point A with coordinates (−2, 23, 33) lies on l. Find the coordinates of the point B which is the mirror image of A in p. [3] (iv) Find the area of triangle OAB, where O is the origin, giving your answer to the nearest whole number. [3]


Exercise 347. (9740 N2009/I/10. Answer on p. 1188.) The planes p1 and p2 have equations r ⋅ (2, 1, 3) = 1 and r ⋅ (−1, 2, 1) = 2 respectively, and meet in a line l. (i) Find the acute angle between p1 and p2 . [3]

(ii) Find a vector equation of l. [4] (iii) The plane p3 has equation 2x + y + 3z − 1 + k(−x + 2y + z − 2) = 0. Explain why l lies in p3 for any constant k. Hence, or otherwise, find a cartesian equation of the plane in which both l and the point (2, 3, 4) lie. [5] Exercise 348. (9740 N2009/II/2. Answer on p. 1188.) Skip part (iv) of this question if you’re taking the 9758 (revised) exam. Relative to the origin O, two points A and B have position vectors given by a = 14i+14j+14k and b = 11i − 13j + 2k respectively. (i) The point P divides the line AB in the ratio 2 ∶ 1. Find the coordinates of P . [2]

(Note: They should have said line segment rather than line. About this, see section 26.1 — “Lines vs. Line Segments” on p. 247.) (ii) Show that AB and OP are perpendicular. [2]

(iii) The vector c is a unit vector in the direction of $\overrightarrow{OP}$. Write c as a column vector, and give the geometrical meaning of ∣a ⋅ c∣. [2]

(iv) Find a × p, where p is the vector $\overrightarrow{OP}$, and give the geometrical meaning of ∣a × p∣. Hence write down the area of the triangle OAP. [4]

Exercise 349. (9740 N2008/I/3. Answer on p. 1189.) Skip part (iii) of this question if you’re taking the 9758 (revised) exam. Points O, A, B are such that $\overrightarrow{OA} = i + 4j - 3k$ and $\overrightarrow{OB} = 5i - j$, and the point P is such that OAPB is a parallelogram.
(i) Find $\overrightarrow{OP}$. [1]

(ii) Find the size of angle AOB. [3] (iii) Find the exact area of the parallelogram OAP B. [2]


Exercise 350. (9740 N2008/I/11. Answer on p. 1189.) Skip this question if you’re taking the 9758 (revised) exam. The equations of three planes p1, p2, p3 are 2x − 5y + 3z = 3, 3x + 2y − 5z = −5, 5x + λy + 17z = µ, respectively, where λ and µ are constants. When λ = −20.9 and µ = 16.6, find the coordinates of the point at which these planes meet. [2]
The planes p1 and p2 intersect in a line l.
(i) Find a vector equation of l. [4]
(ii) Given that all three planes meet in the line l, find λ and µ. [3]
(iii) Given instead that the three planes have no points in common, what can be said about the values of λ and µ? [2]
(iv) Find the cartesian equation of the plane which contains l and the point (1, −1, 3). [4]

Exercise 351. (9233 N2008/I/11. Answer on p. 1190.) The cartesian equations of two lines are

$\frac{x}{1} = \frac{y+2}{2} = \frac{z-5}{-1}$ and $\frac{x-1}{-1} = \frac{y+3}{-3} = \frac{z-4}{1}$.

(i) Show that the lines intersect and state the point of intersection. [5] (ii) Find the acute angle between the lines. [4]

Exercise 352. (9740 N2007/I/6. Answer on p. 1190.) Skip part (iii) of this question if you’re taking the 9758 (revised) exam. Referred to the origin O, the position vectors of the points A and B are i − j + 2k and 2i + 4j + k respectively. (i) Show that OA is perpendicular to OB. [2]

(ii) Find the position vector of the point M on the line segment AB such that AM ∶ M B = 1 ∶ 2. [3]

(iii) The point C has position vector −4i + 2j + 2k. Use a vector product to find the exact area of triangle OAC. [4]


Exercise 353. (9740 N2007/I/8. Answer on p. 1191.) The line l passes through the points A and B with coordinates (1, 2, 4) and (−2, 3, 1) respectively. The plane p has equation 3x − y + 2z = 17. Find (i) the coordinates of the point of intersection of l and p, [5] (ii) the acute angle between l and p, [3] (iii) the perpendicular distance from A to p. [3]

Exercise 354. (9233 N2007/I/7. Answer on p. 1191.) The point P is the foot of the perpendicular from the point A(1, 3, −2) to the line given by

$\frac{x+3}{2} = \frac{y-8}{2} = \frac{z-3}{3}$.

Find the coordinates of P, and hence find the length of AP. [7]

Exercise 355. (9233 N2007/II/2. Answer on p. 1192.) Referred to an origin O the position vectors of two points A and B are 4i + j + 3k and i − 3j + 4k respectively. Two other points, C and D, are given by $\overrightarrow{OC} = 0.25\,\overrightarrow{OA}$ and $\overrightarrow{OD} = 0.75\,\overrightarrow{OB}$.
(i) Find a vector equation for the line AD. [2]

(ii) Find the position vector of the point of intersection of AD and BC. [5]

Exercise 356. (9233 N2006/I/14. Answer on p. 1192.) Skip part (ii) of this question if you’re taking the 9758 (revised) exam. The points A, B, C and D have position vectors i − 2j + 5k, i + 3j, 10i + j + 2k and −2i + 4j + 5k respectively, with respect to an origin O. The point P on AB is such that AP ∶ PB = λ ∶ 1 − λ and the point Q on CD is such that CQ ∶ QD = µ ∶ 1 − µ. Find $\overrightarrow{OP}$ and $\overrightarrow{OQ}$ in terms of λ and µ respectively. [3]
Given that PQ is perpendicular to both AB and CD,
(i) show that $\overrightarrow{PQ} = i + 2j + 2k$, [7]

(ii) find the area of triangle ABQ. [2]


77 Past-Year Questions for Part IV: Complex Numbers

Exercise 357. (9740 N2015/I/9. Answer on p. 1193.) (a) The complex number w is such that w = a + ib, where a and b are non-zero real numbers. The complex conjugate of w is denoted by w∗. Given that $\frac{w^2}{w^*}$ is purely imaginary, find the possible values of w in terms of a. [5]
Skip part (b) of this question if you’re taking the 9758 (revised) exam.
(b) The complex number z is such that $z^5 = -32i$.

(i) Find the modulus and argument of each of the possible values of z. [4]
(ii) Two of these values are $z_1$ and $z_2$, where $0.5\pi < \arg z_1 < \pi$ and $-\pi < \arg z_2 < -0.5\pi$. Find the exact value of $\arg(z_1 - z_2)$ in terms of π and show that $|z_1 - z_2| = 4\sin(0.2\pi)$. [4]

Exercise 358. (9740 N2014/I/5. Answer on p. 1194.) It is given that z = 1 + 2i.
(i) Without using a calculator, find the values of $z^2$ and $\frac{1}{z^3}$ in cartesian form x + iy, showing your working. [4]

(ii) The real numbers p and q are such that $pz^2 + \frac{q}{z^3}$ is real. Find, in terms of p, the value of q and the value of $pz^2 + \frac{q}{z^3}$. [3]

Exercise 359. (9740 N2014/II/4. Answer on p. 1195.) Skip this question if you’re taking the 9758 (revised) exam. (a) The complex number z satisfies ∣z + 5 − i∣ = 4.

(i) On an Argand diagram show the locus of z. [2]

(ii) The complex number z also satisfies ∣z − 6i∣ = ∣z + 10 + 4i∣. Find exactly the possible values of z, giving your answers in the form x + iy. [4]
(b) It is given that $w = \sqrt{3} - i$.

(i) Without using a calculator, find an exact expression for $w^6$. Give your answer in the form $re^{i\theta}$, where r > 0 and 0 ≤ θ < 2π. [3]

(ii) Without using a calculator, find the three smallest positive whole number values of n for which $\frac{w^n}{w^*}$ is a real number. [4]

Exercise 360. (9740 N2013/I/4. Answer on p. 1196.) The complex number w is given by 1 + 2i.

(i) Find $w^3$ in the form x + iy, showing your working. [2]

(ii) Given that w is a root of the equation $az^3 + 5z^2 + 17z + b = 0$, find the values of the real numbers a and b. [3]
(iii) Using these values of a and b, find all the roots of this equation in exact form. [3]

Exercise 361. (9740 N2013/I/8. Answer on p. 1197.) The complex number z is given by $z = re^{i\theta}$, where r > 0 and 0 ≤ θ ≤ 0.5π.
(i) Given that $w = (1 - i\sqrt{3})z$, find ∣w∣ in terms of r and arg w in terms of θ. [2]
Skip parts (ii) and (iii) of this question if you’re taking the 9758 (revised) exam.

(ii) Given that r has a fixed value, draw an Argand diagram to show the locus of z as θ varies. On the same diagram, show the corresponding locus of w. You should identify the modulus and argument of the endpoints of each locus. [4]
(iii) Given that $\arg\frac{z^{10}}{w^2} = \pi$, find θ. [3]

Exercise 362. (9740 N2012/I/6. Answer on p. 1198.) Do not use a calculator in answering this question. The complex number z is given by z = 1 + ic, where c is a non-zero real number.
(i) Find $z^3$ in the form x + iy. [2]

(ii) Given that $z^3$ is real, find the possible values of z. [2]
Skip part (iii) of this question if you’re taking the 9758 (revised) exam.

(iii) For the value of z found in part (ii) for which c < 0, find the smallest positive integer n such that $|z^n| > 1000$. State the modulus and argument of $z^n$ when n takes this value. [4]

Exercise 363. (9740 N2012/II/2. Answer on p. 1199.) Skip this question if you’re taking the 9758 (revised) exam. The complex number z satisfies the equation ∣z − (7 − 3i)∣ = 4.

(i) Sketch an Argand diagram to illustrate this equation. [2] (ii) Given that ∣z∣ is as small as possible, (a) find the exact value of ∣z∣, [2]

(b) hence find an exact expression for z, in the form x + iy. [2]

(iii) It is given instead that −π < arg z ≤ π and that ∣arg z∣ is as large as possible. Find the value of arg z in radians, correct to 4 significant figures. [3]

Exercise 364. (9740 N2011/I/10. Answer on p. 1200.) Do not use a graphic calculator in answering this question.
(i) The roots of the equation $z^2 = -8i$ are $z_1$ and $z_2$. Find $z_1$ and $z_2$ in cartesian form x + iy, showing your working. [4]

(ii) Hence, or otherwise, find in cartesian form the roots $w_1$ and $w_2$ of the equation $w^2 + 4w + (4 + 2i) = 0$. [3]
Skip parts (iii) and (iv) of this question if you’re taking the 9758 (revised) exam.

(iii) Using a single Argand diagram, sketch the loci
(a) $|z - z_1| = |z - z_2|$, [1]

(b) $|z - w_1| = |z - w_2|$. [1]

(iv) Give a reason why there are no points which lie on both of these loci. [1]

Exercise 365. (9740 N2011/II/1. Answer on p. 1201.) Skip this question if you’re taking the 9758 (revised) exam. The complex number z satisfies ∣z − 2 − 5i∣ ≤ 3.

(i) On an Argand diagram, sketch the region in which the point representing z can lie. [3] (ii) Find exactly the maximum and minimum possible values of ∣z∣. [2]

(iii) It is given that $0 \leq \arg z \leq \frac{\pi}{4}$. With this extra information, find the maximum value of ∣z − 6 − i∣. Label the point(s) that correspond to this maximum value on your diagram with the letter P. [3]


Exercise 366. (9740 N2010/I/8. Answer on p. 1202.) The complex numbers $z_1$ and $z_2$ are given by $1 + i\sqrt{3}$ and $-1 - i$ respectively.

(i) Express each of $z_1$ and $z_2$ in polar form r(cos θ + i sin θ), where r > 0 and −π < θ ≤ π. Give r and θ in exact form. [2]
(ii) Find the complex conjugate of $\frac{z_1}{z_2}$ in exact polar form. [3]
Skip parts (iii) and (iv) of this question if you’re taking the 9758 (revised) exam.
(iii) On a single Argand diagram, sketch the loci
(a) $|z - z_1| = 2$,
(b) $\arg(z - z_2) = \frac{\pi}{4}$. [4]

(iv) Find where the locus $|z - z_1| = 2$ meets the positive real axis. [2]

Exercise 367. (9740 N2010/II/1. Answer on p. 1203.)
(i) Solve the equation $x^2 - 6x + 34 = 0$. [2]
(ii) One root of the equation $x^4 + 4x^3 + x^2 + ax + b = 0$, where a and b are real, is x = −2 + i. Find the values of a and b and the other roots. [5]

Exercise 368. (9740 N2009/I/9. Answer on p. 1204.)
(i) Solve the equation $z^7 - (1 + i) = 0$, giving the roots in the form $re^{i\alpha}$, where r > 0 and −π < α ≤ π. [5]
(ii) Show the roots on an Argand diagram. [2]

Skip part (iii) of this question if you’re taking the 9758 (revised) exam.
(iii) The roots represented by $z_1$ and $z_2$ are such that $0 < \arg z_1 < \arg z_2 < \frac{\pi}{2}$. Explain why the locus of all points z such that $|z - z_1| = |z - z_2|$ passes through the origin. Draw this locus on your Argand diagram and find its exact cartesian equation. [5]

Exercise 369. (9740 N2008/I/8. Answer on p. 1205.) A graphic calculator is not to be used in answering this question.
(i) It is given that $z_1 = 1 + \sqrt{3}\,i$. Find the value of $z_1^3$, showing clearly how you obtain your answer. [3]
(ii) Given that $1 + \sqrt{3}\,i$ is a root of the equation $2z^3 + az^2 + bz + 4 = 0$, find the values of the real numbers a and b. [4]

(iii) For these values of a and b, solve the equation in part (ii), and show all the roots on an Argand diagram. [4]

Exercise 370. (9740 N2008/II/3. Answer on p. 1206.) (a) The complex number w has modulus r and argument θ, where $0 < \theta < \frac{\pi}{2}$, and w∗ denotes the conjugate of w. State the modulus and argument of p, where $p = \frac{w}{w^*}$. [2]
Given that $p^5$ is real and positive, find the possible values of θ. [2]

Skip part (b) of this question if you’re taking the 9758 (revised) exam.
(b) The complex number z satisfies the relations ∣z∣ ≤ 6 and ∣z∣ = ∣z − 8 − 6i∣.
(i) Illustrate both of these relations on a single Argand diagram. [3]

(ii) Find the greatest and least possible values of arg z, giving your answers in radians correct to 3 decimal places. [4]

Exercise 371. (9233 N2008/I/9. Answer on p. 1207.) Skip this question if you’re taking the 9758 (revised) exam. In an Argand diagram, the point P represents the complex number z. Clearly labelling any relevant points, draw three separate diagrams to show the locus of P in each of the following cases. (i) ∣z + 2i∣ = 2. [2]

(ii) ∣z − 2 − i∣ = ∣z − i∣. [2]
(iii) $\frac{\pi}{6} \leq \arg(z + 1 - 3i) \leq \frac{\pi}{3}$. [4]

Exercise 372. (9233 N2008/II/3. Answer on p. 1209.)
(i) Verify that w = 1 − i satisfies the equation $w^2 = -2i$ and write down the other root of this equation. [3]
(ii) Use the quadratic formula to solve the equation $z^2 - (3 + 5i)z - 4(1 - 2i) = 0$. [4]

Exercise 373. (9740 N2007/I/3. Answer on p. 1210.) If you’re taking the 9758 (revised) exam, you can skip part (a) but you should still do part (b) of this question.
(a) Sketch, on an Argand diagram, the locus of points representing the complex number z such that $|z + 2 - 3i| = \sqrt{13}$. [3]

(b) The complex number w is such that ww∗+2w = 3+4i, where w∗ is the complex conjugate of w. Find w in the form a + ib, where a and b are real. [4]


Exercise 374. (9740 N2007/I/7. Answer on p. 1211.) The polynomial P(z) has real coefficients. The equation P(z) = 0 has a root $re^{i\theta}$, where r > 0 and 0 < θ < π.

(i) Write down a second root in terms of r and θ, and hence show that a quadratic factor of P(z) is $z^2 - 2rz\cos\theta + r^2$. [3]

Skip parts (ii) and (iii) of this question if you’re taking the 9758 (revised) exam.

(ii) Solve the equation $z^6 = -64$, expressing the solutions in the form $re^{i\theta}$, where r > 0 and −π < θ ≤ π. [4]

(iii) Hence, or otherwise, express $z^6 + 64$ as the product of three quadratic factors with real coefficients, giving each factor in non-trigonometrical form. [3]

Exercise 375. (9233 N2007/I/9. Answer on p. 1212.)
(i) The equation $az^4 + bz^3 + cz^2 + dz + e = 0$ has a root z = ki, where k is real and non-zero. Given that the coefficients a, b, c, d and e are real, show that $ad^2 + b^2e = bcd$. [5]

(ii) Verify that this condition is satisfied for the equation $z^4 + 3z^3 + 13z^2 + 27z + 36 = 0$ and hence find two roots of this equation which are of the form z = ki, where k is real. [3]

Exercise 376. (9233 N2007/II/5. Answer on p. 1213.) Skip this question if you’re taking the 9758 (revised) exam.

Illustrate, on an Argand diagram, the locus of a point P representing the complex number z, where $\arg(z - 2i) = \frac{\pi}{3}$. [3]
Illustrate, using the same Argand diagram, the locus of a point Q representing the complex number z, where ∣z − 4∣ = ∣z + 2∣. [2]
Hence find the exact value of z such that $\arg(z - 2i) = \frac{\pi}{3}$ and ∣z − 4∣ = ∣z + 2∣, giving your answer in the form a + ib. [3]
Show that, in this case, $zz^* = 8 + 4\sqrt{3}$. [2]

Exercise 377. (9233 N2006/I/5. Answer on p. 1214.) Skip this question if you’re taking the 9758 (revised) exam. The complex number z satisfies ∣z + 4 − 4i∣ = 3.

(i) Describe, with the aid of a sketch, the locus of the point which represents z in an Argand diagram. [3] (ii) Find the least possible value of ∣z − i∣. [2]

Exercise 378. (9233 N2006/I/6. Answer on p. 1215.)
(i) Show that the equation $z^4 - 2z^3 + 6z^2 - 8z + 8 = 0$ has a root of the form ki where k is real. [3]

(ii) Hence solve the equation $z^4 - 2z^3 + 6z^2 - 8z + 8 = 0$. [3]

78 Past-Year Questions for Part V: Calculus

Exercise 379. (9740 N2015/I/3. Answer on p. 1216.)
(i) Given that f is a continuous function, explain, with the aid of a sketch, why the value of

$\lim_{n\to\infty} \frac{1}{n}\left[f\left(\frac{1}{n}\right) + f\left(\frac{2}{n}\right) + \cdots + f\left(\frac{n}{n}\right)\right]$

is $\int_0^1 f(x)\,dx$.

(ii) Hence evaluate $\lim_{n\to\infty} \frac{1}{n}\left(\frac{\sqrt[3]{1} + \sqrt[3]{2} + \cdots + \sqrt[3]{n}}{\sqrt[3]{n}}\right)$.

Exercise 380. (9740 N2015/I/4. Answer on p. 1217.) A piece of wire of fixed length d m is cut into two parts. One part is bent into the shape of a rectangle with sides of length x m and y m. The other part is bent into the shape of a semicircle, including its diameter. The radius of the semicircle is x m. Show that the maximum value of the total area of the two shapes can be expressed as $kd^2$ m², where k is a constant to be found.

Exercise 381. (9740 N2015/I/6. Answer on p. 1217.)
(i) Write down the first three non-zero terms in the Maclaurin series for ln (1 + 2x), where −0.5 < x ≤ 0.5, simplifying the coefficients. [2]

(ii) It is given that the three terms found in part (i) are equal to the first three terms in the series expansion of $ax(1 + bx)^c$ for small x. Find the exact values of the constants a, b and c and use these values to find the coefficient of $x^4$ in the expansion of $ax(1 + bx)^c$, giving your answer as a simplified rational number. [6]


Exercise 382. (9740 N2015/I/10. Answer on p. 1218.) With origin O, the curves with equations y = sin x and y = cos x, where 0 ≤ x ≤ 0.5π, meet at the point P with coordinates $\left(\frac{\pi}{4}, \frac{\sqrt{2}}{2}\right)$. The area of the region bounded by the curves and the x-axis is $A_1$ and the area of the region bounded by the curves and the y-axis is $A_2$ (see diagram).

[Diagram: the curves y = sin x and y = cos x for 0 ≤ x ≤ 0.5π, meeting at P, with the regions A1 and A2 marked.]

(i) Show that $\frac{A_1}{A_2} = \sqrt{2}$. [4]

(ii) The region bounded by y = sin x between O and P, the line $y = 0.5\sqrt{2}$ and the y-axis is rotated about the y-axis through 360°. Show that the volume of the solid formed is given by

$\pi\int_0^{0.5\sqrt{2}} (\sin^{-1} y)^2\,dy$. [2]

(iii) Show that the substitution y = sin u transforms the integral in part (ii) to $\pi\int_a^b u^2\cos u\,du$, for limits a and b to be determined. Hence find the exact volume. [6]
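The ratio in part (i) of Exercise 382 can be checked numerically before attempting the exact integration. The sketch below uses a hand-written composite Simpson's rule; the function and variable names are illustrative and not taken from the original question.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

q = math.pi / 4  # x-coordinate of the intersection point P

# Region next to the x-axis: under sin x up to P, then under cos x.
area_1 = simpson(math.sin, 0, q) + simpson(math.cos, q, math.pi / 2)
# Region next to the y-axis: between cos x (above) and sin x (below).
area_2 = simpson(lambda x: math.cos(x) - math.sin(x), 0, q)

print(area_1 / area_2, math.sqrt(2))
```

The two printed numbers should agree to many decimal places, which supports the stated result without proving it.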

Exercise 383. (9740 N2015/I/11. Answer on p. 1219.) A curve C has parametric equations $x = \sin^3\theta$, $y = 3\sin^2\theta\cos\theta$, for 0 ≤ θ ≤ 0.5π.

(i) Show that $\frac{dy}{dx} = 2\cot\theta - \tan\theta$. [3]

(ii) Show that C has a turning point when $\tan\theta = \sqrt{k}$, where k is an integer to be determined. Find, in non-trigonometric form, the exact coordinates of the turning point and explain why it is a maximum. [6]
(iii) Show that the area of the region bounded by C and the x-axis is given by

$\int_0^{0.5\pi} 9\sin^4\theta\cos^2\theta\,d\theta$.

Use your calculator to find the area, giving your answer correct to 3 decimal places. [3]
The line with equation y = ax, where a is a positive constant, meets C at the origin and at the point P.
(iv) Show that $\tan\theta = \frac{3}{a}$ at P. Find the exact value of a such that the line passes through the maximum point of C. [3]

Exercise 384. (9740 N2015/II/1. Answer on p. 1219.) As a tree grows, the rate of increase of its height, h m, with respect to time, t years after planting, is modelled by the differential equation

$\frac{dh}{dt} = \frac{1}{10}\sqrt{16 - 0.5h}$.

The tree is planted as a seedling of negligible height, so that h = 0 when t = 0. (i) State the maximum height of the tree, according to this model. [1]

(ii) Find an expression for t in terms of h, and hence find the time the tree takes to reach half its maximum height. [5]

Exercise 385. (9740 N2014/I/2. Answer on p. 1220.) The curve C has equation $x^2y + xy^2 + 54 = 0$. Without using a calculator, find the coordinates of the point on C at which the gradient is −1, showing that there is only one such point. [6]


Exercise 386. (9740 N2014/I/7. Answer on p. 1220.) It is given that $f(x) = x^6 - 3x^4 - 7$. The diagram shows the curve with equation y = f(x) and the line with equation y = −7, for x ≥ 0. The curve crosses the positive x-axis at x = α, and the curve and the line meet where x = 0 and x = β.

[Diagram: the curve y = f(x) for x ≥ 0, crossing the x-axis at x = α, and the line y = −7, meeting the curve at (β, −7).]

(i) Find the value of α, giving your answer correct to 3 decimal places, and find the exact value of β. [2]
(ii) Evaluate $\int_\beta^\alpha f(x)\,dx$, giving your answer correct to 3 decimal places. [2]
(iii) Find, in terms of $\sqrt{3}$, the area of the finite region bounded by the curve and the line, for x ≥ 0. [3]

(iv) Show that f (x) = f (−x). What can be said about the six roots of the equation f (x) = 0? [4]

Exercise 387. (9740 N2014/I/8. Answer on p. 1221.) It is given that $f(x) = \frac{1}{\sqrt{9 - x^2}}$, where −3 < x < 3.
(i) Write down $\int f(x)\,dx$. [1]

(ii) Find the binomial expansion for f(x), up to and including the term in $x^6$. Give the coefficients as exact fractions in their simplest form. [4]
(iii) Hence, or otherwise, find the first four non-zero terms of the Maclaurin series for $\sin^{-1}\frac{x}{3}$. Give the coefficients as exact fractions in their simplest form. [4]


Exercise 388. (9740 N2014/I/10. Answer on p. 1222.) The mass, x grams, of a certain substance present in a chemical reaction at time t minutes satisfies the differential equation

$\frac{dx}{dt} = k(1 + x - x^2)$,

where 0 ≤ x ≤ 0.5 and k is a constant. It is given that x = 0.5 and $\frac{dx}{dt} = -0.25$ when t = 0.

(i) Show that k = −0.2. [1]

(ii) By first expressing 1 + x − x2 in completed square form, find t in terms of x. [5] (iii) Hence find

(a) the exact time taken for the mass of the substance present in the chemical reaction to become half of its initial value, [1] (b) the time taken for there to be none of the substance present in the chemical reaction, giving your answer correct to 3 decimal places. [1] (iv) Express the solution of the differential equation in the form x = f (t) and sketch the part of the curve with this equation which is relevant in this context. [5]


Exercise 389. (9740 N2014/I/11. Answer on p. 1223.) [It is given that the volume of a sphere of radius r is $\frac{4}{3}\pi r^3$ and the volume of a circular cone with base radius r and height h is $\frac{1}{3}\pi r^2 h$.]

[Diagram: a hemisphere of radius r joined to a circular cone of base radius r, height h and slant edge 4.]

A toy manufacturer makes a toy which consists of a hemisphere of radius r cm joined to a circular cone of base radius r cm and height h cm (see diagram). The manufacturer determines that the length of the slant edge of the cone must be 4 cm and that the total volume of the toy, V cm$^3$, should be as large as possible.

(i) Find a formula for V in terms of r. Given that $r = r_1$ is the value of r which gives the maximum value of V, show that $r_1$ satisfies the equation $45r^4 - 768r^2 + 1024 = 0$. [6]

(ii) Find the two solutions to the equation in part (i) for which $r > 0$, giving your answers correct to 3 decimal places. [2]

(iii) Show that one of the solutions found in part (ii) does not give a stationary value of V. Hence write down the value of $r_1$ and find the corresponding value of h. [3]

(iv) Sketch the graph showing the volume of the toy as the radius of the hemisphere varies. [3]


Exercise 390. (9740 N2014/II/2. Answer on p. 1224.) Using partial fractions, find

$$\int_0^2 \frac{9x^2 + x - 13}{(2x - 5)(x^2 + 9)}\,dx.$$

Give your answer in the form $a \ln b + c \tan^{-1} d$, where a, b, c and d are rational numbers to be determined. [9]

Exercise 391. (9740 N2013/I/5. Answer on p. 1225.) It is given that

$$f(x) = \begin{cases} \sqrt{1 - \dfrac{x^2}{a^2}} & \text{for } -a \leq x \leq a, \\ 0 & \text{for } a < x < 2a, \end{cases}$$

and that $f(x + 3a) = f(x)$ for all real values of x, where a is a real constant.

(i) Sketch the graph of $y = f(x)$ for $-4a \leq x \leq 6a$. [3]

(ii) Use the substitution $x = a \sin\theta$ to find the exact value of $\int_{a/2}^{\sqrt{3}a/2} f(x)\,dx$ in terms of a and $\pi$. [5]

Exercise 392. (9740 N2013/I/10. Answer on p. 1226.) The variables x, y and z are connected by the following differential equations.

$$\frac{dz}{dx} = 3 - 2z \quad \text{(A)} \qquad \text{and} \qquad \frac{dy}{dx} = z \quad \text{(B)}$$

(i) Given that $z < 1.5$, solve equation (A) to find z in terms of x.

(ii) Hence find y in terms of x.

(iii) Use the result in part (ii) to show that $\dfrac{d^2y}{dx^2} = a\,\dfrac{dy}{dx} + b$, for constants a and b to be determined. [3]

You can skip part (iv) if you’re taking the 9758 (revised) exam.

(iv) The result in part (ii) represents a family of curves. Some members of the family are straight lines. Write down the equations of two of these lines. On a single diagram, sketch one of your lines together with a non-linear member of the family of curves that has your line as an asymptote. [4]


Exercise 393. (9740 N2013/I/11. Answer on p. 1227.) A curve C has parametric equations $x = 3t^2$, $y = 2t^3$.

(i) Find the equation of the tangent to C at the point with parameter t. [3]

(ii) Points P and Q on C have parameters p and q respectively. The tangent at P meets the tangent at Q at the point R. Show that the x-coordinate of R is $p^2 + pq + q^2$, and find the y-coordinate of R in terms of p and q. Given that $pq = -1$, show that R lies on the curve with equation $x = y^2 + 1$. [5]

[Diagram: the parts of the curves C and L for which $y \geq 0$, touching at the point M, with the region between them shaded.]

A curve L has equation $x = y^2 + 1$. The diagram shows the parts of C and L for which $y \geq 0$. The curves C and L touch at the point M.

(iii) Show that $4t^6 - 3t^2 + 1 = 0$ at M. Hence, or otherwise, find the exact coordinates of M. [3]

(iv) Find the exact value of the area of the shaded region bounded by C and L for which $y \geq 0$. [6]


Exercise 394. (9740 N2013/II/2. Answer on p. 1228.) Fig. 1 shows a piece of card, ABC, in the form of an equilateral triangle of side a. A kite shape is cut from each corner, to give the shape shown in Fig. 2. The remaining card shown in Fig. 2 is folded along the dotted lines, to form the open triangular prism of height x shown in Fig. 3.

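[Diagrams: Fig. 1, Fig. 2 and Fig. 3, showing the card ABC of side a, the card after the kites are cut from its corners, and the folded open prism of height x.]

(i) Show that the volume V of the prism is given by $V = 0.25\sqrt{3}\,x\,(a - 2\sqrt{3}\,x)^2$. [3]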

(ii) Use differentiation to find, in terms of a, the maximum value of V , proving that it is a maximum. [6]

Author’s remark: The question should have clearly stated that either a or x is a fixed constant. Otherwise the volume V of the prism is unbounded — simply blow up both a and x to ∞! My guess is that the writers of this question intended a to be the fixed constant.
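The remark above can be illustrated numerically. The following is only a minimal sketch, assuming the formula for V stated in part (i); the function name `prism_volume` and the choice of letting x grow in proportion to a (here x = a/(4√3)) are hypothetical and used purely for illustration.

```python
import math


def prism_volume(a: float, x: float) -> float:
    """Volume from part (i): V = 0.25 * sqrt(3) * x * (a - 2*sqrt(3)*x)**2."""
    return 0.25 * math.sqrt(3) * x * (a - 2 * math.sqrt(3) * x) ** 2


# Hypothetical scaling: let x grow in step with a, with x = a / (4*sqrt(3)).
# Then a - 2*sqrt(3)*x = a/2 and V = a**3 / 64, which grows without bound.
for a in (4.0, 8.0, 16.0, 32.0):
    x = a / (4 * math.sqrt(3))
    print(f"a = {a:5.1f}, x = {x:7.4f}, V = {prism_volume(a, x):9.4f}")
```

If a is instead held fixed, x is confined to the interval where a − 2√3 x ≥ 0, and V is then bounded, which is the reading the remark suggests was intended.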

Exercise 395. (9740 N2013/II/3. Answer on p. 1229.) (i) Given that $f(x) = \ln(1 + 2\sin x)$, find $f(0)$, $f'(0)$, $f''(0)$ and $f'''(0)$. Hence write down the first three non-zero terms in the Maclaurin series for $f(x)$. [7]

(ii) The first two non-zero terms in the Maclaurin series for $f(x)$ are equal to the first two non-zero terms in the series expansion of $e^{ax}\sin nx$. Using appropriate expansions from the List of Formulae (MF15), find the constants a and n. Hence find the third non-zero term of the series expansion of $e^{ax}\sin nx$ for these values of a and n. [5]


Exercise 396. (9740 N2012/I/2. Answer on p. 1230.) (i) Find $\int \dfrac{x^3}{1 + x^4}\,dx$. [2]

(ii) Use the substitution $u = x^2$ to find $\int \dfrac{x}{1 + x^4}\,dx$. [3]

(iii) Evaluate $\int_0^1 \left(\dfrac{x}{1 + x^4}\right)^2 dx$, giving the answer correct to 3 decimal places. [1]

Exercise 397. (9740 N2012/I/4. Answer on p. 1230.) In the triangle ABC, $AB = 1$, $\angle BAC = \theta$ radians and $\angle ABC = 0.75\pi$ (see diagram).

[Diagram: triangle ABC with AB = 1, angle $\theta$ at A and angle $0.75\pi$ at B.]

(i) Show that $AC = \dfrac{1}{\cos\theta - \sin\theta}$. [4]

(ii) Given that $\theta$ is a sufficiently small angle, show that $AC \approx 1 + a\theta + b\theta^2$, for constants a and b to be determined. [4]

Exercise 398. (9740 N2012/I/8. Answer on p. 1231.) The curve C has equation $x - y = (x + y)^2$. It is given that C has only one turning point.

(i) Show that $1 + \dfrac{dy}{dx} = \dfrac{2}{2x + 2y + 1}$. [4]

(ii) Hence, or otherwise, show that $\dfrac{d^2y}{dx^2} = -\left(1 + \dfrac{dy}{dx}\right)^3$. [3]

(iii) Hence, state, with a reason, whether the turning point is a maximum or a minimum. [2]


Exercise 399. (9740 N2012/I/10. Answer on p. 1232.) [It is given that a sphere of radius r has surface area $4\pi r^2$ and volume $\frac{4}{3}\pi r^3$.]

A model of a concert hall is made up of three parts. The roof is modelled by the curved surface of a hemisphere of radius r cm. The walls are modelled by the curved surface of a cylinder of radius r cm and height h cm. The floor is modelled by a circular disc of radius r cm. The three parts are joined together as shown in the diagram. The model is made of material of negligible thickness.

[Diagram: the model, a cylinder of radius r and height h with a hemispherical roof.]

(i) It is given that the volume of the model is a fixed value k cm$^3$, and the external surface area is a minimum. Use differentiation to find the values of r and h in terms of k. Simplify your answers. [7]

(ii) It is given instead that the volume of the model is 200 cm$^3$ and its external surface area is 180 cm$^2$. Show that there are two possible values of r. Given also that $r < h$, find the value of r and the value of h. [5]

Exercise 400. (9740 N2012/I/11. Answer on p. 1233.) A curve C has parametric equations $x = \theta - \sin\theta$, $y = 1 - \cos\theta$, where $0 \leq \theta \leq 2\pi$.

(i) Show that $\dfrac{dy}{dx} = \cot(0.5\theta)$ and find the gradient of C at the point where $\theta = \pi$. What can be said about the tangents to C as $\theta \to 0$ and $\theta \to 2\pi$? [5]

(ii) Sketch C, showing clearly the features of the curve at the points where $\theta = 0$, $\pi$ and $2\pi$. [3]

(iii) Without using a calculator, find the exact area of the region bounded by C and the x-axis. [5]

(iv) A point P on C has parameter p, where $0 < p < 0.5\pi$. Show that the normal to C at P crosses the x-axis at the point with coordinates $(p, 0)$. [3]


Exercise 401. (9740 N2012/II/1. Answer on p. 1234.) (a) Find the general solution of the differential equation $\dfrac{d^2y}{dx^2} = 16 - 9x^2$, giving your answer in the form $y = f(x)$. [3]

(b) Given that u and t are related by $\dfrac{du}{dt} = 16 - 9u^2$, and that $u = 1$ when $t = 0$, find t in terms of u, simplifying your answer. [5]

Exercise 402. (9740 N2011/I/3. Answer on p. 1234.) The parametric equations of a curve are $x = t^2$, $y = \dfrac{2}{t}$.

(i) Find the equation of the tangent to the curve at the point $\left(p^2, \dfrac{2}{p}\right)$, simplifying your answer. [2]

(ii) Hence find the coordinates of the points Q and R where this tangent meets the x- and y-axes respectively. [2]

(iii) Find a cartesian equation of the locus of the mid-point of QR as p varies.

Exercise 403. (9740 N2011/I/4. Answer on p. 1235.) (i) Use the first three non-zero terms of the Maclaurin series for $\cos x$ to find the Maclaurin series for $g(x)$, where $g(x) = \cos^6 x$, up to and including the term in $x^4$. [3]

(ii) (a) Use your answer to part (i) to give an approximation for $\int_0^a g(x)\,dx$ in terms of a, and evaluate this approximation in the case where $a = \dfrac{\pi}{4}$. [3]

Exercise 404. (9740 N2011/I/5. Answer on p. 1236.) It is given that $f(x) = 2 - x$.

(i) On separate diagrams, sketch the graphs of $y = f(|x|)$ and $y = |f(x)|$, giving the coordinates of any points where the graphs meet the x- and y-axes. You should label the graphs clearly. [3]

(ii) State the set of values of x for which $f(|x|) = |f(x)|$.

(iii) Find the exact value of the constant a for which $\int_{-1}^{1} f(|x|)\,dx = \int_{1}^{a} |f(x)|\,dx$. [3]

Exercise 405. (9740 N2011/I/8. Answer on p. 1237.) (i) Find $\int (100 - v^2)^{-1}\,dv$.

(ii) A stone is dropped from a stationary balloon. It leaves the balloon with zero speed, and t seconds later its speed v metres per second satisfies the differential equation $\dfrac{dv}{dt} = 10 - 0.1v^2$.

(a) Find t in terms of v. Hence find the exact time the stone takes to reach a speed of 5 metres per second. [5]

(b) Find the speed of the stone after 1 second. [3]

(c) What happens to the speed of the stone for large values of t?

Exercise 406. (9740 N2011/II/2. Answer on p. 1237.) The diagram shows a rectangular piece of cardboard ABCD of sides n metres and 2n metres, where n is a positive constant. A square of side x metres is removed from each corner of ABCD. The remaining shape is now folded along PQ, QR, RS and SP to form an open rectangular box of height x metres.

[Diagram: the rectangle ABCD with sides n and 2n, a square of side x marked at each corner, and the inner rectangle PQRS.]

(i) Show that the volume V cubic metres of the box is given by $V = 2n^2 x - 6nx^2 + 4x^3$.

(ii) Without using a calculator, find in surd form the value of x that gives a stationary value of V , and explain why there is only one answer. [6]


Exercise 407. (9740 N2011/II/4. Answer on p. 1238.) (a) (i) Obtain a formula for $\int_0^n x^2 e^{-2x}\,dx$ in terms of n, where $n > 0$. [5]

(ii) Hence evaluate $\int_0^\infty x^2 e^{-2x}\,dx$. [1]

(b) The region bounded by the curve $y = \dfrac{4x}{x^2 + 1}$, the x-axis and the lines $x = 0$ and $x = 1$ is rotated through $2\pi$ radians about the x-axis. Use the substitution $x = \tan\theta$ to show that the volume of the solid obtained is given by $16\pi \int_0^{\pi/4} \sin^2\theta\,d\theta$, and evaluate this integral exactly. [6]

Exercise 408. (9740 N2010/I/2. Answer on p. 1238.) (i) Find the first three terms of the Maclaurin series for $e^x(1 + \sin 2x)$. [You may use standard results given in the List of Formulae (MF15).] [3]

(ii) It is given that the first two terms of this series are equal to the first two terms in the series expansion, in ascending powers of x, of $\left(1 + \dfrac{4}{3}x\right)^n$. Find n and show that the third terms in each of these series are equal. [3]

Exercise 409. (9740 N2010/I/4. Answer on p. 1239.) (i) Given that $x^2 - y^2 + 2xy + 4 = 0$, find $\dfrac{dy}{dx}$ in terms of x and y. [4]

(ii) For the curve with equation $x^2 - y^2 + 2xy + 4 = 0$, find the coordinates of each point at which the tangent is parallel to the x-axis. [4]


Exercise 410. (9740 N2010/I/6. Answer on p. 1239.) The diagram shows the curve with equation $y = x^3 - 3x + 1$ and the line with equation $y = 1$. The curve crosses the x-axis at $x = \alpha$, $x = \beta$ and $x = \gamma$ and has turning points where $x = -1$ and $x = 1$.

[Diagram: the curve $y = x^3 - 3x + 1$ and the line $y = 1$, with the origin O and the values $\alpha$, $-1$, $\beta$, 1 and $\gamma$ marked on the x-axis.]

(i) Find the values of $\beta$ and $\gamma$, giving your answers correct to 3 decimal places. [2]

(ii) Find the area of the region bounded by the curve and the x-axis between $x = \beta$ and $x = \gamma$. [2]

(iii) Use a non-calculator method to find the area of the region bounded by the curve and the red line, where $x \leq 0$. [4]

(iv) Find the set of values of k for which the equation $x^3 - 3x + 1 = k$ has three real distinct roots. [2]

Exercise 411. (9740 N2010/I/7. Answer on p. 1240.) (i) A bottle containing liquid is taken from a refrigerator and placed in a room where the temperature is a constant 20 °C. As the liquid warms up, the rate of increase of its temperature θ °C after time t minutes is proportional to the temperature difference (20 − θ) °C. Initially the temperature of the liquid is 10 °C and the rate of increase of the temperature is 1 °C per minute. By setting up and solving a differential equation, show that $\theta = 20 - 10e^{-0.1t}$. [7]

(ii) Find the time it takes the liquid to reach a temperature of 15 °C, and state what happens to θ for large values of t. Sketch a graph of θ against t. [4]
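As a quick sanity check of the formula quoted in part (i), the sketch below confirms numerically that it meets the two initial conditions and that its rate of change stays proportional to 20 − θ. It is not a substitute for setting up and solving the differential equation; the helper names `theta` and `dtheta_dt` and the constant `K = 0.1` (the value implied by the initial data) are assumptions introduced only for this illustration.

```python
import math


def theta(t: float) -> float:
    """Temperature model quoted in part (i): theta = 20 - 10*exp(-0.1*t)."""
    return 20 - 10 * math.exp(-0.1 * t)


def dtheta_dt(t: float, h: float = 1e-6) -> float:
    """Central-difference estimate of the rate of change of theta at time t."""
    return (theta(t + h) - theta(t - h)) / (2 * h)


K = 0.1  # assumed constant of proportionality, implied by the initial data

print("theta(0) =", theta(0.0))              # initial temperature
print("rate at t = 0 ~", dtheta_dt(0.0))     # initial rate of increase
for t in (0.0, 5.0, 20.0):
    # The two columns should agree if d(theta)/dt = K * (20 - theta).
    print(t, dtheta_dt(t), K * (20 - theta(t)))
```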


Exercise 412. (9740 N2010/I/9. Answer on p. 1241.) A company requires a box made of cardboard of negligible thickness to hold 300 cm$^3$ of powder when full. The length of the box is 3x cm, the width is x cm and the height is y cm. The lid has depth ky cm, where $0 < k \leq 1$ (see diagram).

[Diagram: the box (3x by x by y) and the lid (3x by x, with depth ky), drawn apart.]

(i) Use differentiation to find, in terms of k, the value of x which gives a minimum total external surface area of the box and the lid. [6]

Author’s remark: The question is ambiguous, so I interpret “external surface area” to mean the external surface area when the lid is placed over the box. Another possible and entirely reasonable interpretation is that this refers to the total external surface area when the box and lid are kept apart, as depicted in the diagram! This would yield a different answer.

(ii) Find also the ratio of the height to the width, $\dfrac{y}{x}$, in this case, simplifying your answer. [2]

(iii) Find the values between which $\dfrac{y}{x}$ must lie. [2]

(iv) Find the value of k for which the box has square ends. [2]


Exercise 413. (9740 N2010/I/11. Answer on p. 1242.) A curve C has parametric equations $x = t + \dfrac{1}{t}$, $y = t - \dfrac{1}{t}$.

(i) The point P on the curve has parameter p. Show that the equation of the tangent at P is $(p^2 + 1)x - (p^2 - 1)y = 4p$. [4]

(ii) The tangent at P meets the line $y = x$ at the point A and the line $y = -x$ at the point B. Show that the area of triangle OAB is independent of p, where O is the origin. [4]

(iii) Find a cartesian equation of C. Sketch C, giving the coordinates of any points where C crosses the x- and y-axes and the equations of any asymptotes. [4]

Exercise 414. (9740 N2010/II/3. Answer on p. 1243.) (i) Given that $y = x\sqrt{x + 2}$, find $\dfrac{dy}{dx}$, expressing your answer as a single algebraic fraction. Hence, show that there is only one value of x for which the curve $y = x\sqrt{x + 2}$ has a turning point, and state this value.

(ii) A curve has equation $y^2 = x^2(x + 2)$.

(a) Find exactly the possible values of the gradient at the point where $x = 0$. [2]

(b) Sketch the curve $y^2 = x^2(x + 2)$.

(iii) On a separate diagram sketch the graph of $y = f'(x)$, where $f(x) = x\sqrt{x + 2}$. State the equations of any asymptotes. [2]

Exercise 415. (9740 N2009/I/2. Answer on p. 1245.) Find the exact value of p such that

$$\int_0^1 \frac{1}{4 - x^2}\,dx = \int_0^{\frac{1}{2p}} \frac{1}{\sqrt{1 - p^2 x^2}}\,dx. \quad [5]$$

Exercise 416. (9740 N2009/I/4. Answer on p. 1245.) It is given that

$$f(x) = \begin{cases} 7 - x^2 & \text{for } 0 < x \leq 2, \\ 2x - 1 & \text{for } 2 < x \leq 4, \end{cases}$$

and that $f(x) = f(x + 4)$ for all real values of x.

(i) Evaluate $f(27) + f(45)$. [2]

(ii) Sketch the graph of $y = f(x)$ for $-7 \leq x \leq 10$. [3]

(iii) Find $\int_{-4}^{3} f(x)\,dx$. [3]


Exercise 417. (9740 N2009/I/7. Answer on p. 1246.) (i) Given that $f(x) = e^{\cos x}$, find $f(0)$, $f'(0)$ and $f''(0)$. Hence write down the first two non-zero terms in the Maclaurin series for $f(x)$. Give the coefficients in terms of e. [5]

(ii) Given that the first two non-zero terms in the Maclaurin series for $f(x)$ are equal to the first two non-zero terms in the series expansion of $\dfrac{1}{a + bx^2}$, where a and b are constants, find a and b in terms of e. [4]

Exercise 418. (9740 N2009/I/11. Answer on p. 1246.) The curve C has equation $y = f(x)$, where $f(x) = xe^{-x^2}$.

(i) Sketch the curve C. [2]

(ii) Find the exact coordinates of the turning points on the curve. [4]

(iii) Use the substitution $u = x^2$ to find $\int_0^n f(x)\,dx$, for $n > 0$. Hence find the area of the region between the curve and the positive x-axis. [4]

(iv) Find the exact value of $\int_{-2}^{2} |f(x)|\,dx$.

(v) Find the volume of revolution when the region bounded by the curve, the lines $x = 0$, $x = 1$ and the x-axis is rotated completely about the x-axis. Give your answer correct to 3 significant figures. [2]

Exercise 419. (9740 N2009/II/1. Answer on p. 1247.) The curve C has parametric equations $x = t^2 + 4t$, $y = t^3 + t^2$.

(i) Sketch the curve for −2 ≤ t ≤ 1. [1]

The tangent to the curve at the point P where t = 2 is denoted by l.

(ii) Find the cartesian equation of l. [3]

(iii) The tangent l meets C again at the point Q. Use a non-calculator method to find the coordinates of Q. [4]


Exercise 420. (9740 N2009/II/4. Answer on p. 1248.) Two scientists are investigating the change of a certain population of size n thousand at time t years.

(i) One scientist suggests that n and t are related by the differential equation $\dfrac{d^2n}{dt^2} = 10 - 6t$. Find the general solution of this differential equation. Sketch three members of the family of solution curves, given that $n = 100$ when $t = 0$. [5] (You can skip the last sentence if you’re taking the 9758 (revised) exam.)

(ii) The other scientist suggests that n and t are related by the differential equation $\dfrac{dn}{dt} = 3 - 0.02n$. Find n in terms of t, given again that $n = 100$ when $t = 0$. Explain in simple terms what will eventually happen to the population using this model. [7]

Exercise 421. (9740 N2008/I/1. Answer on p. 1249.) The diagram shows the curve with equation $y = x^2$. The area of the region bounded by the curve, the lines $x = 1$, $x = 2$ and the x-axis is equal to the area of the region bounded by the curve, the lines $y = a$, $y = 4$ and the y-axis, where $a < 4$. Find the value of a. [4]

[Diagram: the curve $y = x^2$, with 1 and 2 marked on the x-axis and a and 4 marked on the y-axis.]

Exercise 422. (9740 N2008/I/4. Answer on p. 1249.) (i) Find the general solution of the differential equation $\dfrac{dy}{dx} = \dfrac{3x}{x^2 + 1}$. [2]

(ii) Find the particular solution of the differential equation for which y = 2 when x = 0. [1]

(iii) What can you say about the gradient of every solution curve as x → ±∞? [1]

You can skip part (iv) if you’re taking the 9758 (revised) exam.

(iv) Sketch, on a single diagram, the graph of the solution found in part (ii), together with 2 other members of the family of solution curves. [3]

Exercise 423. (9740 N2008/I/5. Answer on p. 1250.) (i) Find the exact value of $\int_0^{1/\sqrt{3}} \dfrac{1}{1 + 9x^2}\,dx$. [3]

(ii) Find, in terms of n and e, $\int_1^e x^n \ln x\,dx$, where $n \neq -1$. [4]

Exercise 424. (9740 N2008/I/6. Answer on p. 1250.) (a) In the triangle ABC, $AB = 1$, $BC = 3$ and $\angle ABC = \theta$ radians. Given that $\theta$ is a sufficiently small angle, show that $AC \approx \sqrt{4 + 3\theta^2} \approx a + b\theta^2$, for constants a and b to be determined. [5]

(b) Given that $f(x) = \tan\left(2x + \dfrac{\pi}{4}\right)$, find $f(0)$, $f'(0)$ and $f''(0)$. Hence find the first 3 terms in the Maclaurin series of $f(x)$. [5]


Exercise 425. (9740 N2008/I/7. Answer on p. 1251.) A new flower-bed is being designed for a large garden. The flower-bed will occupy a rectangle x m by y m together with a semicircle of diameter x m, as shown in the diagram. A low wall will be built around the flower-bed. The time needed to build the wall will be 3 hours per metre for the straight parts and 9 hours per metre for the semicircular part. Given that a total time of 180 hours is taken to build the wall, find, using differentiation, the values of x and y which give a flower-bed of maximum area. [10]

[Diagram: the flower-bed, a rectangle x by y joined to a semicircle of diameter x.]


Exercise 426. (9740 N2008/II/1. Answer on p. 1252.) Let $f(x) = e^x \sin x$.

(i) Sketch the graph of $y = f(x)$ for $-3 \leq x \leq 3$. [2]

(ii) Find the series expansion of $f(x)$ in ascending powers of x, up to and including the term in $x^3$. [3]

Denote the answer to part (ii) by $g(x)$.

(iii) On the same diagram as in part (i), sketch the graph of $y = g(x)$. Label the two graphs clearly. [1]

(iv) Find, for $-3 \leq x \leq 3$, the set of values of x for which the value of $g(x)$ is within ±0.5 of the value of $f(x)$. [3]

Exercise 427. (9740 N2008/II/2. Answer on p. 1253.) The diagram shows the curve C with equation $y^2 = x\sqrt{1 - x}$. The region enclosed by C is denoted by R.

[Diagram: the curve C enclosing the region R, with O, 0.5 and 1 marked on the x-axis and 0.5 and −0.5 marked on the y-axis.]

(i) Write down an integral that gives the area of R, and evaluate this integral numerically. [3]

(ii) The part of R above the x-axis is rotated through $2\pi$ radians about the x-axis. By using the substitution $u = 1 - x$, or otherwise, find the exact value of the volume obtained. [3]

(iii) Find the exact x-coordinate of the maximum point of C. [3]


Exercise 428. (9233 N2008/I/2. Answer on p. 1253.) Find the constants a and b such that, when x is small, $\dfrac{\cos 2x}{\sqrt{1 + x^2}} \approx a + bx^2$. [4]

Exercise 429. (9233 N2008/I/3. Answer on p. 1253.) Show that $\int_0^1 xe^{-2x}\,dx = \dfrac{1}{4} - \dfrac{3}{4}e^{-2}$. [5]
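A "show that" result such as the one in Exercise 429 can be cross-checked numerically before attempting the integration by parts. The sketch below is one hedged way to do so, using a composite Simpson's rule written from scratch; the helper `simpson` and its parameters are hypothetical names introduced for this illustration, and agreement to many decimal places supports the stated value without proving it.

```python
import math


def simpson(f, a: float, b: float, n: int = 1000) -> float:
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd-indexed interior points get weight 4, even-indexed get weight 2.
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


numeric = simpson(lambda x: x * math.exp(-2 * x), 0.0, 1.0)
closed_form = 0.25 - 0.75 * math.exp(-2)  # the value stated in the exercise
print(numeric, closed_form)
```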

Exercise 430. (9233 N2008/I/4. Answer on p. 1253.) Use the substitution $t = \ln x$ to find the value of $\int_e^{e^3} \dfrac{1}{x(\ln x)^2}\,dx$. [6]

Exercise 431. (9233 N2008/I/6. Answer on p. 1254.) (i) Given that $0 < a < b$, sketch the graph of $y = |x - a|$ for $-b \leq x \leq b$. [3]

(ii) Find $\int_{-b}^{b} |x - a|\,dx$. [2]

Exercise 432. (9233 N2008/I/8. Answer on p. 1254.) Find the exact value of a for which

$$\int_a^\infty \frac{1}{4 + x^2}\,dx = \int_{1/2}^{\sqrt{3}/2} \frac{1}{\sqrt{1 - x^2}}\,dx. \quad [5]$$

Exercise 433. (9233 N2008/I/10. Answer on p. 1255.) (i) Prove that the substitution $y = xz$ reduces the differential equation $xy\dfrac{dy}{dx} = x^2 + y^2$ to $xz\dfrac{dz}{dx} = 1$. [3]

(ii) Hence find the solution of the differential equation $xy\dfrac{dy}{dx} = x^2 + y^2$ for which $y = 6$ when $x = 2$. [5]

Exercise 434. (9233 N2008/I/13. Answer on p. 1255.) A curve is defined by the parametric equations $x = \cos^3 t$, $y = \sin^3 t$, for $0 < t < \dfrac{\pi}{4}$.

(i) Show that the equation of the normal to the curve at the point $P(\cos^3 t, \sin^3 t)$ is $x\cos t - y\sin t = \cos^4 t - \sin^4 t$. [5]

(ii) Prove the identity $\cos^4 t - \sin^4 t \equiv \cos 2t$. [2]

(iii) The normal at P meets the x-axis at A and the y-axis at B. Show that the length of AB can be expressed in the form $k\cot 2t$, where k is a constant to be found. [5]


Exercise 435. (9233 N2008/II/1. Answer on p. 1256.) Use the formulae for $\cos(A + B)$ and $\cos(A - B)$, with $A = 5x$ and $B = x$, to show that $2\sin 5x \sin x$ can be written as $\cos px - \cos qx$, where p and q are positive integers. [2]

Hence find the exact value of $\int_0^{\pi/3} \sin 5x \sin x\,dx$. [3]

Exercise 436. (9233 N2008/II/5. Answer on p. 1256.) (i) Show that the derivative of the function $\ln(1 + x) - \dfrac{2x}{x + 2}$ is never negative. [5]

(ii) Hence show that $\ln(1 + x) \geq \dfrac{2x}{x + 2}$ when $x \geq 0$. [3]

Exercise 437. (9740 N2007/I/4. Answer on p. 1256.) The current I in an electric circuit at time t satisfies the differential equation $4\dfrac{dI}{dt} = 2 - 3I$.

(i) Find I in terms of t, given that $I = 2$ when $t = 0$. [6]

(ii) State what happens to the current in this circuit for large values of t. [1]

Exercise 438. (9740 N2007/I/11. Answer on p. 1257.) A curve has parametric equations $x = \cos^2 t$, $y = \sin^3 t$, for $0 \leq t \leq 0.5\pi$.

(i) Sketch the curve. [2]

(ii) The tangent to the curve at the point $(\cos^2\theta, \sin^3\theta)$, where $0 < \theta < 0.5\pi$, meets the x- and y-axes at Q and R respectively. The origin is denoted by O. Show that the area of △OQR is $\dfrac{1}{12}\sin\theta\,(3\cos^2\theta + 2\sin^2\theta)^2$. [6]

(iii) Show that the area under the curve for $0 \leq t \leq 0.5\pi$ is $2\int_0^{0.5\pi} \cos t \sin^4 t\,dt$, and use the substitution $\sin t = u$ to find this area. [5]

Exercise 439. (9740 N2007/II/3. Answer on p. 1258.) (i) By successively differentiating $(1 + x)^n$, find the Maclaurin series for $(1 + x)^n$, up to and including the term in $x^3$. [4]

(ii) Obtain the expansion of $(4 - x)^{1.5}(1 + 2x^2)^{1.5}$ up to and including the term in $x^3$. [5]

(iii) Find the set of values of x for which the expansion in part (ii) is valid. [2]


Exercise 440. (9740 N2007/II/4. Answer on p. 1259.) (i) Find the exact value of $\int_0^{5\pi/3} \sin^2 x\,dx$. Hence find the exact value of $\int_0^{5\pi/3} \cos^2 x\,dx$.

(ii) The region R is bounded by the curve $y = x^2 \sin x$, the line $x = 0.5\pi$ and the part of the x-axis between 0 and $0.5\pi$. Find

(a) the exact area of R, [5]

(b) the numerical value of the volume of revolution formed when R is rotated completely about the x-axis, giving your answer correct to 3 decimal places. [2]

Exercise 441. (9233 N2007/I/2. Answer on p. 1259.) Find the first negative coefficient in the expansion of $(4 + 3x)^{2.5}$ in a series of ascending powers of x, where $|x| < \dfrac{4}{3}$. Give your answer as a fraction in its lowest terms. [3]

Exercise 442. (9233 N2007/I/3. Answer on p. 1259.) The region bounded by the curve $y = \dfrac{1}{\sqrt{1 + 4x^2}}$, the x-axis and the lines $x = 0.5$ and $x = 0.5\sqrt{3}$ is rotated through 4 right angles about the x-axis to form a solid of revolution of volume V. Find the exact value of V, giving your answer in the form $k\pi^2$. [5]

Exercise 443. (9233 N2007/I/8. Answer on p. 1259.) Use the substitution $t = \sin u$ to show that $\int \dfrac{(\sin^{-1} t)\cos\left[(\sin^{-1} t)^2\right]}{\sqrt{1 - t^2}}\,dt$ simplifies to $\int u \cos u^2\,du$. [3]

Hence evaluate $\int_0^1 \dfrac{(\sin^{-1} t)\cos\left[(\sin^{-1} t)^2\right]}{\sqrt{1 - t^2}}\,dt$. [4]

Exercise 444. (9233 N2007/I/10. Answer on p. 1260.) (i) By sketching the graphs of $y = \cos x$ and $y = \sin x$, or otherwise, solve the inequality $\cos x > \sin x$ for $0 \leq x \leq 2\pi$. [3]

(ii) Evaluate $\int_0^{2\pi} |\cos x - \sin x|\,dx$. [5]

Exercise 445. (9233 N2007/I/11. Answer on p. 1260.) Use partial fractions to evaluate $\int_1^4 \dfrac{5x + 4}{(x - 5)(x^2 + 4)}\,dx$, giving your answer in the form $-\ln a$, where a is a positive integer. [9]


Exercise 446. (9233 N2007/I/13. Answer on p. 1261.) In this question, the result $\dfrac{d}{dx}\sec x = \sec x \tan x$ may be quoted without proof. Given that $y = \ln(\sec x)$, show that

(i) $\dfrac{d^3y}{dx^3} = 2\,\dfrac{d^2y}{dx^2}\,\dfrac{dy}{dx}$, [3]

(ii) the value of $\dfrac{d^4y}{dx^4}$ when $x = 0$ is 2. [4]

(iii) Write down the Maclaurin series for $\ln(\sec x)$ up to and including the term in $x^4$. [2]

(iv) By substituting $x = \dfrac{\pi}{4}$, show that $\ln 2 \approx \dfrac{\pi^2}{16} + \dfrac{\pi^4}{1536}$. [3]
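To get a feel for how good the approximation in part (iv) is, the two sides can be compared numerically. This is only an illustrative sketch using Python's standard `math` module; the variable names are arbitrary.

```python
import math

# Right-hand side of the approximation stated in part (iv).
approx = math.pi ** 2 / 16 + math.pi ** 4 / 1536
exact = math.log(2)

print(f"series estimate = {approx:.6f}")
print(f"ln 2            = {exact:.6f}")
print(f"difference      = {exact - approx:.6f}")
```

The estimate falls slightly short of ln 2, which is what one would expect from truncating a series whose remaining terms are positive.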

Exercise 447. (9233 N2007/I/14. Answer on p. 1262.) A family of curves is given by $x^2 - y^2 = Ax$, where A is an arbitrary constant.

(i) Show that $\dfrac{dy}{dx} = \dfrac{x^2 + y^2}{2xy}$.

(ii) A second, related family of curves is given by the differential equation $\dfrac{dy}{dx} = -\dfrac{2xy}{x^2 + y^2}$. By substituting $y = vx$, where v is a function of x, show that, for the second family of curves, $x\dfrac{dv}{dx} = -\dfrac{3v + v^3}{1 + v^2}$. [4]

(iii) Hence show that the second family of curves is given by $3x^2 y + y^3 = C$, where C is an arbitrary constant.

Exercise 448. (9233 N2006/I/7. Answer on p. 1263.) A hollow cone of semi-vertical angle 45° is held with its axis vertical and vertex downwards (see diagram). At the beginning of an experiment, it is filled with 390 cm$^3$ of liquid. The liquid runs out through a small hole at the vertex at a constant rate of 2 cm$^3$ s$^{-1}$. Find the rate at which the depth of the liquid is decreasing 3 minutes after the start of the experiment. [6]

[Diagram: the cone, vertex downwards, with the semi-vertical angle of 45° marked.]


Exercise 449. (9233 N2006/I/8. Answer on p. 1263.) Find the coordinates of the points on the curve $3x^2 + xy + y^2 = 33$ at which the tangent is parallel to the x-axis. [7]

Exercise 450. (9233 N2006/I/9. Answer on p. 1263.) (i) Use the derivative of $\cos\theta$ to show that $\dfrac{d}{d\theta}\sec\theta = \sec\theta\tan\theta$. [2]

(ii) Use the substitution $x = \sec\theta - 1$ to find the exact value of $\int_{\sqrt{2} - 1}^{1} \dfrac{1}{(x + 1)\sqrt{x^2 + 2x}}\,dx$. [6]

Exercise 451. (9233 N2006/I/12. Answer on p. 1264.) (i) Express $f(x) = \dfrac{1 + x - 2x^2}{(2 - x)(1 + x^2)}$ in partial fractions. [4]

(ii) Expand $f(x)$ in ascending powers of x, up to and including the term in $x^2$. [5]

(iii) State the set of values of x for which the expansion is valid. [1]


Exercise 452. (9233 N2006/I/14. Answer on p. 1265.) A curve has parametric equations x = ct, y = c/t, where c is a positive constant. Three points P (cp, c/p), Q (cq, c/q), R (cr, c/r) on the curve are shown in the diagram.

[Diagram: the curve, with the points P, Q and R marked.]

(i) Prove that the gradient of QR is −1/qr. [2]

(ii) Given that the line through P perpendicular to QR meets the curve at V (cv, c/v), find v in terms of p, q and r. [2] (iii) Find the gradient of the normal at P . [3]

(iv) The normal at P meets the curve again at S (cs, c/s). Show that s = −1/p3 . [2]

(v) Given that angle QP R is 90○ , prove that QR is parallel to the normal at P . [3]

Exercise 453. (9233 N2006/II/2. Answer on p. 1265.) (i) Given that $z = \dfrac{x}{\sqrt{x^2 + 32}}$, show that $\dfrac{dz}{dx} = 32\,(x^2 + 32)^{-1.5}$. [3]

(ii) Find the exact value of the area of the region bounded by the curve $y = (x^2 + 32)^{-1.5}$, the x-axis and the lines $x = 2$ and $x = 7$. [3]


79 Past-Year Questions for Part VI: Prob. and Stats.

Exercise 454. (9740 N2015/II/5. Answer on p. 1266.) You can skip this entire question if you’re taking the 9758 (revised) exam. The manager of a busy supermarket wishes to conduct a survey of the opinions of customers of different ages about different types of cola drink. (i) Give a reason why the manager would not be able to use stratified sampling. [1] (ii) Explain briefly how the manager could carry out a survey using quota sampling. [2] (iii) Give one reason why quota sampling would not necessarily provide a sample which is representative of the customers of the supermarket. [1]

Exercise 455. (9740 N2015/II/6. Answer on p. 1266.) ‘Droppers’ are small sweets that are made in a variety of colours. Droppers are sold in packets and the colours of the sweets in a packet are independent of each other. On average, 25% of Droppers are red. (i) A small packet of Droppers contains 10 sweets. Find the probability that there are at least 4 red sweets in a small packet. [2] You can skip parts (ii) and (iii) if you’re taking the 9758 (revised) exam. A large packet of Droppers contains 100 sweets. (ii) Use a suitable approximation, which should be stated, to find the probability that a large packet contains at least 30 red sweets. [3] (iii) Yip buys 15 large packets of Droppers. Find the probability that no more than 3 of these packets contain at least 30 red sweets. [2]

Exercise 456. (9740 N2015/II/7. Answer on p. 1267.) You can skip this entire question if you’re taking the 9758 (revised) exam. The average number of errors per page for a certain daily newspaper is being investigated. (i) State, in context, two assumptions that need to be made for the number of errors per page to be well modelled by a Poisson distribution. [2] Assume that the number of errors per page has the distribution Po(1.3). (ii) Find the probability that, on one day, there are more than 10 errors altogether on the first 6 pages. [3] (iii) The probability that there are fewer than 2 errors altogether on the first n pages of the newspaper is less than 0.05. Write down an inequality in terms of n to represent this information, and hence find the least possible value of n. [2]

Exercise 457. (9740 N2015/II/8. Answer on p. 1267.) A market stall sells pineapples which have masses that are normally distributed. The stall owner claims the mean mass of the pineapples is at least 0.9 kg. Nur buys a random selection of 8 pineapples from the stall. The 8 pineapples have masses, in kg, as follows.

0.80 1.000 0.82 0.85 0.93 0.96 0.81 0.89

(i) Find unbiased estimates of the population mean and variance of the mass of pineapples. You can skip part (ii) of this question if you’re taking the 9758 (revised) exam. (ii) Test the stall owner’s claim at the 10% level of significance. [7]

Exercise 458. (9740 N2015/II/9. Answer on p. 1268.) For events A, B and C it is given that P(A) = 0.45, P(B) = 0.4, P(C) = 0.3 and P(A ∩ B ∩ C) = 0.1. It is also given that events A and B are independent, and that events A and C are independent. (i) Find P(B∣A). [1]

(ii) Given also that events B and C are independent, find P (A′ ∩ B ′ ∩ C ′ ). [3]

(iii) Given instead that events B and C are not independent, find the greatest and least possible values of P(A′ ∩ B′ ∩ C′). [4]

Exercise 459. (9740 N2015/II/10. Answer on p. 1269.) In an experiment the following information was gathered about air pressure P, measured in inches of mercury, at different heights above sea-level h, measured in feet.

h: 2000 5000 10000 15000 20000 25000 30000 35000 40000 45000
P: 27.8 24.9 20.6 16.9 13.8 11.1 8.89 7.04 5.52 4.28

(i) Draw a scatter diagram for these values, labelling the axes. [1] (ii) Find, correct to 4 decimal places, the product moment correlation coefficient between (a) h and P; (b) ln h and P; (c) √h and P. [3]

(iii) Using the most appropriate case from part (ii), find the equation which best models air pressure at different heights. [3]

(iv) Given that 1 metre = 3.28 feet, re-write your equation from part (iii) so that it can be used to estimate the air pressure when the height is given in metres. [2]

Exercise 460. (9740 N2015/II/11. Answer on p. 1270.) This question is about arrangements of all eight letters in the word CABBAGES. (i) Find the number of different arrangements of the eight letters that can be made. [2] (ii) Write down the number of these arrangements in which the letters are not in alphabetical order. [1] (iii) Find the number of different arrangements that can be made with both the A’s together and both the B’s together. [2] (iv) Find the number of different arrangements that can be made with no two adjacent letters the same. [4]
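
(A brute-force check, not part of the original question. This is only a sketch in standard-library Python: it lists every distinct arrangement of CABBAGES and counts those satisfying each condition. The variable names are made up for illustration. Use it to verify answers obtained by the usual counting arguments, not to replace them.)

```python
from itertools import permutations

WORD = "CABBAGES"

# A set removes the duplicates caused by the repeated A's and B's.
arrangements = {"".join(p) for p in permutations(WORD)}

# (i) All distinct arrangements.
total = len(arrangements)

# (ii) Arrangements whose letters are not in alphabetical order.
not_alphabetical = sum(1 for a in arrangements if list(a) != sorted(a))

# (iii) Both A's together and both B's together.
pairs_together = sum(1 for a in arrangements if "AA" in a and "BB" in a)

# (iv) No two adjacent letters the same.
no_adjacent_same = sum(
    1 for a in arrangements if all(x != y for x, y in zip(a, a[1:]))
)

print(total, not_alphabetical, pairs_together, no_adjacent_same)
```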

Exercise 461. (9740 N2015/II/12. Answer on p. 1270.) In this question you should state clearly the values of the parameters of any normal distribution you use. The masses in grams of apples have the distribution N(300, 20²) and the masses in grams of pears have the distribution N(200, 15²). A certain recipe requires 5 apples and 8 pears. (i) Find the probability that the total mass of 5 randomly chosen apples is more than 1600 grams. [2]

(ii) Find the probability that the total mass of 5 randomly chosen apples is more than the total mass of 8 randomly chosen pears. [3] The recipe requires the apples and pears to be prepared by peeling them and removing the cores. This process reduces the mass of each apple by 15% and the mass of each pear by 10%. (iii) Find the probability that the total mass, after preparation, of 5 randomly chosen apples and 8 randomly chosen pears is less than 2750 grams. [4]
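
(A numerical check for parts (i) and (ii), not part of the original question. This is only a sketch, assuming Python 3.8+ for statistics.NormalDist and assuming that all the masses are independent. It uses the fact that for sums and differences of independent normal variables, the means add or subtract while the variances always add.)

```python
from math import hypot, sqrt
from statistics import NormalDist

# Totals of independent normal masses: means add, variances add.
apples_total = NormalDist(mu=5 * 300, sigma=sqrt(5) * 20)  # N(1500, 2000)
pears_total = NormalDist(mu=8 * 200, sigma=sqrt(8) * 15)   # N(1600, 1800)

# Part (i): P(total mass of 5 apples > 1600).
p_apples_over_1600 = 1 - apples_total.cdf(1600)

# Part (ii): D = (apples total) - (pears total) ~ N(-100, 3800); we want P(D > 0).
difference = NormalDist(
    mu=apples_total.mean - pears_total.mean,
    sigma=hypot(apples_total.stdev, pears_total.stdev),
)
p_apples_heavier = 1 - difference.cdf(0)

print(round(p_apples_over_1600, 4), round(p_apples_heavier, 4))
```

Part (iii) follows the same pattern, with each apple's mass scaled by 0.85 and each pear's by 0.9 before the totals are formed.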

Exercise 462. (9740 N2014/II/5. Answer on p. 1271.) You can skip this entire question if you’re taking the 9758 (revised) exam. An Internet retailer has compiled a list of 10000 regular customers and wishes to carry out a survey of customer opinions involving 5% of its customers. (i) Describe how the marketing manager could choose customers for this survey using systematic sampling. [2] (ii) Give one advantage and one disadvantage of systematic sampling in this context. [2]

Exercise 463. (9740 N2014/II/6. Answer on p. 1271.) A team in a particular sport consists of 1 goalkeeper, 4 defenders, 2 midfielders and 4 attackers. A certain club has 3 goalkeepers, 8 defenders, 5 midfielders and 6 attackers. (i) How many different teams can be formed by the club? [2] One of the midfielders in the club is the brother of one of the attackers in the club. (ii) How many different teams can be formed which include exactly one of the two brothers? [3] The two brothers leave the club. The club manager decides that one of the remaining midfielders can play either as a midfielder or as a defender. (iii) How many different teams can now be formed by the club? [3]

Exercise 464. (9740 N2014/II/7. Answer on p. 1272.) Yan is carrying out an experiment with a fair 6-sided die and a biased 6-sided die, each numbered from 1 to 6. (i) Yan rolls the fair die 10 times. Find the probability that it shows a 6 exactly thrice. [1] You can skip parts (ii) and (iii) if you’re taking the 9758 (revised) exam. (ii) Yan now rolls the fair die 60 times. Use a suitable approximate distribution, which should be stated, to find the probability that the die shows a 6 between 5 and 8 times, inclusive. [3] The probability that the biased die shows a 6 is 1/15. (iii) Yan rolls the biased die 60 times. Use a suitable approximate distribution, which should be stated, to find the probability that the biased die shows a 6 between 5 and 8 times, inclusive. [3]

Exercise 465. (9740 N2014/II/8. Answer on p. 1273.) (a) Sketch a scatter diagram that might be expected when x and y are related approximately by y = px² + t in each of the cases (i) and (ii) below. In each case your diagram should include 6 points, approximately equally spaced with respect to x, and with all x-values positive. (i) p and t are both positive. (ii) p is negative and t is positive. [2]

(b) The age in months (m) and prices in dollars (P) of a random sample of ten used cars of a certain model are given in the table.

m: 11 20 28 36 40 47 58 62 68 75
P: 112800 102600 76500 72000 72000 69000 65800 57000 50600 47600

It is thought that the price after m months can be modelled by one of the formulae

P = am + b,    P = c ln m + d,

where a, b, c and d are constants.

(i) Find, correct to 4 decimal places, the value of the product moment correlation coefficient between (A) m and P; and (B) ln m and P. (ii) Explain which of P = am + b and P = c ln m + d is the better model and find the equation of a suitable regression line for this model. [3] (iii) Use the equation of your regression line to estimate the price of a car that is 50 months old. [1]

Exercise 466. (9740 N2014/II/9. Answer on p. 1274.) The number of minutes that the 0815 bus arrives late at my local bus stop has a normal distribution; the mean number of minutes the bus is late has been 4.3. A new company takes over the service, claiming that punctuality will be improved. After the new company takes over, a random sample of 10 days is taken and the number of minutes that the bus is late is recorded. The sample mean is t̄ minutes and the sample variance is k² minutes². A test is to be carried out at the 10% level of significance to determine whether the mean number of minutes late has been reduced. (i) State appropriate hypotheses for the test, defining any symbols that you use. [2] You can skip parts (ii) and (iii) of this question if you’re taking the 9758 (revised) exam. (ii) Given that k² = 3.2, find the set of values of t̄ for which the result of the test would be that the null hypothesis is not rejected. [4] (iii) Given instead that t̄ = 4.0, find the set of values of k² for which the result of the test would be to reject the null hypothesis. [3]

Exercise 467. (9740 N2014/II/10. Answer on p. 1274.) A game has three sets of ten symbols, and one symbol from each set is randomly chosen to be displayed on each turn. The symbols are as follows.

Set 1: + + + + × × × ◯ ◯ ⋆
Set 2: + + + × ◯ ◯ ◯ ◯ ⋆ ⋆
Set 3: + + × × × × ◯ ◯ ◯ ⋆

For example, if a + symbol is chosen from set 1, a ◯ symbol is chosen from set 2 and a ⋆ symbol is chosen from set 3, the display would be +◯⋆. (i) Find the probability that, on one turn, (a) ⋆ ⋆ ⋆ is displayed, [1]

(b) at least one ⋆ symbol is displayed, [2]

(c) two × symbols and one + symbol are displayed in any order. [3]

(ii) Given that exactly one of the symbols displayed is ⋆, find the probability that the other two symbols are + and ◯. [4]

Exercise 468. (9740 N2014/II/11. Answer on p. 1275.) You can skip this entire question if you’re taking the 9758 (revised) exam. An art dealer sells both original paintings and prints. (Prints are copies of paintings.) It is to be assumed that his sales of originals per week can be modelled by the distribution Po(2) and his sales of prints per week can be modelled by the independent distribution Po(11). (i) Find the probability that, in a randomly chosen week, (a) the art dealer sells more than 8 prints, [2] (b) the art dealer sells a total of fewer than 15 prints and originals combined. [2] (ii) The probability that the art dealer sells fewer than 3 originals in a period of n weeks is less than 0.01. Express this information as an inequality in n, and hence find the smallest possible integer value of n. [5] (iii) Using a suitable approximation, which should be stated, find the probability that the art dealer sells more than 550 prints in a year (52 weeks). [3] (iv) Give two reasons in context why the assumptions made at the start of this question may not be valid. [2]

Exercise 469. (9740 N2013/II/5. Answer on p. 1276.) A large multi-national company has 100000 employees based in several different countries. To celebrate the 90th anniversary of the founding of the company, the Chief Executive wishes to invite a representative sample of 90 employees to a party, to be held at the company’s Headquarters in Singapore. (i) Explain how random sampling could be carried out to choose the 90 employees. Explain briefly why this may not provide the representative sample that the Chief Executive wants. [2] You can skip part (ii) of this question if you’re taking the 9758 (revised) exam. (ii) Name a more appropriate sampling method, and explain how it can be carried out to provide the representative sample that the Chief Executive wants. [2]

Exercise 470. (9740 N2013/II/6. Answer on p. 1276.) The continuous random variable Y has the distribution N(µ, σ²). It is known that P(Y < 2a) = 0.95 and P(Y < a) = 0.25. Express µ in the form ka, where k is a constant to be determined. [4]

Exercise 471. (9740 N2013/II/7. Answer on p. 1276.) On average one in 20 packets of a breakfast cereal contains a free gift. Jack buys n packets from a supermarket. The number of these packets containing a free gift is the random variable F. (i) State, in context, two assumptions needed for F to be well modelled by a binomial distribution. [2] Assume now that F has a binomial distribution. (ii) Given that n = 20, find P(F = 1). [1]

You can skip part (iii) if you’re taking the 9758 (revised) exam. (iii) Given instead that n = 60, use a suitable approximation to find the probability that F is at least 5. State the parameter(s) of the distribution that you use. [3]

Exercise 472. (9740 N2013/II/8. Answer on p. 1277.) For events A and B it is given that P(A) = 0.7, P(B∣A′) = 0.8 and P(A∣B′) = 0.88. Find

(i) P (B ∩ A′ ), [1]

(ii) P (A′ ∩ B ′ ), [2] (iii) P (A ∩ B). [3]

Exercise 473. (9740 N2013/II/9. Answer on p. 1277.) A motoring magazine editor believes that the figures quoted by car manufacturers for distances travelled per litre of fuel are too high. He carries out a survey into this by asking for information from readers. For a certain model of car, 8 readers reply with the following data, measured in km per litre.

14.0 12.5 11.0 11.0 12.5 12.6 15.6 13.2

(i) Calculate unbiased estimates of the population mean and variance. [2] You can skip the rest of this question if you’re taking the 9758 (revised) exam. The manufacturer claims that this model of car will travel 13.8 km per litre on average. It is given that the distances travelled per litre for cars of this model are normally distributed. (ii) Stating a necessary assumption, carry out a t-test of the magazine editor’s belief at the 5% significance level. [5]

Exercise 474. (9740 N2013/II/10. Answer on p. 1278.) (i) Sketch a scatter diagram that might be expected when x and y are related approximately as given in each of the cases (A), (B) and (C) below. In each case your diagram should include 6 points, approximately equally spaced with respect to x, and with all x- and y-values positive. The letters a, b, c, d, e and f represent constants. (A) y = a + bx², where a is positive and b is negative,

(B) y = c + d ln x, where c is positive and d is negative,

(C) y = e + f/x, where e is positive and f is negative. [3]

A motoring website gives the following information about the distance travelled, y km, by a certain type of car at different speeds, x km h⁻¹, on a fixed amount of fuel.

Speed, x: 88 96 104 112 120 128
Distance, y: 148 147 144 138 126 107

(ii) Draw the scatter diagram for these values, labelling the axes. [1] (iii) Explain which of the three cases in part (i) is the most appropriate for modelling these values, and calculate the product moment correlation coefficient for this case. [2] (iv) It is required to estimate the distance travelled at a speed of 110 km h⁻¹. Use the case that you identified in part (iii) to find the equation of a suitable regression line, and use your equation to find the required estimate. [3]

Exercise 475. (9740 N2013/II/11. Answer on p. 1279.) A machine is used to generate codes consisting of three letters followed by two digits. Each of the three letters generated is equally likely to be any of the twenty-six letters of the alphabet A - Z. Each of the two digits generated is equally likely to be any of the nine digits 1 - 9. The digit 0 is not used. Find the probability that a randomly chosen code has (i) three different letters and two different digits, [2] (ii) the second digit higher than the first digit, [2] (iii) exactly two letters the same or two digits the same, but not both, [4] (iv) exactly one vowel (A, E, I, O or U) and exactly one even digit. [4]

Exercise 476. (9740 N2013/II/12. Answer on p. 1280.) You can skip this entire question if you’re taking the 9758 (revised) exam. A company has two departments and each department records the number of employees absent through illness each day. Over a long period of time it is found that the average numbers absent on a day are 1.2 for the Administration Department and 2.7 for the Manufacturing Department. (i) State, in this context, two conditions that must be met for the numbers of absences to be well modelled by Poisson distributions. Explain why each of your two conditions may not be met. [3] For the remainder of this question assume that these conditions are met. You should assume also that absences in the two departments are independent of each other. (ii) Find the smallest number of days for which the probability that no employee is absent through illness from the Administration Department is less than 0.01. [2] Each employee absent on a day represents one ‘day of absence’. So, one employee absent for 3 days contributes 3 days of absence, and 5 employees absent on 1 day contribute 5 days of absence. (iii) Find the probability that, in a 5-day period, the total number of days of absence in the two departments is more than 20. [3] (iv) Use a suitable approximation, which should be stated together with its parameter(s), to find the probability that, in a 60-day period, the total number of days of absence in the two departments is between 200 and 250 inclusive. [4]

Exercise 477. (9740 N2012/II/5. Answer on p. 1281.) The probability that a hospital patient has a particular disease is 0.001. A test for the disease has probability p of giving a positive result when the patient has the disease, and equal probability p of giving a negative result when the patient does not have the disease. A patient is given the test. (i) Given that p = 0.995, find the probability that (a) the result of the test is positive, [2]

(b) the patient has the disease given that the result of the test is positive. [2] (ii) It is given instead that there is a probability of 0.75 that the patient has the disease given that the result of the test is positive. Find the value of p, giving your answer correct to 6 decimal places. [3]
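
(A numerical check for part (i), not part of the original question. This is only a sketch in standard-library Python, with made-up function names. It applies the law of total probability and then Bayes’ rule to the figures in the question: prevalence 0.001, and probability p that the test gives the correct result in either case.)

```python
def positive_rate(prevalence: float, p: float) -> float:
    """P(positive) = P(disease) * p + P(no disease) * (1 - p)."""
    return prevalence * p + (1 - prevalence) * (1 - p)


def prob_disease_given_positive(prevalence: float, p: float) -> float:
    """Bayes' rule: P(disease | positive) = P(disease and positive) / P(positive)."""
    return prevalence * p / positive_rate(prevalence, p)


PREVALENCE = 0.001
p_positive = positive_rate(PREVALENCE, 0.995)
p_disease_if_positive = prob_disease_given_positive(PREVALENCE, 0.995)
print(round(p_positive, 5), round(p_disease_if_positive, 4))
```

Even with a very accurate test, the probability in (b) is small, because true positives are swamped by false positives from the much larger disease-free group. Part (ii) asks for the value of p that raises it to 0.75.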

Exercise 478. (9740 N2012/II/6. Answer on p. 1281.) On a remote island a zoologist measures the tail lengths of a random sample of 20 squirrels. In a species of squirrel known to her, the tail lengths have mean 14.0 cm. She carries out a test, at the 5% significance level, of whether squirrels on the island have the same mean tail length as the species known to her. She assumes that the tail lengths of squirrels on the island are normally distributed with standard deviation 3.8 cm. (i) State appropriate hypotheses for the test. [1] The sample mean tail length is denoted by x̄ cm. (ii) Use an algebraic method to calculate the set of values of x̄ for which the null hypothesis would not be rejected. (Answers obtained by trial and improvement from a calculator will obtain no marks.) [3] (iii) State the conclusion of the test in the case where x̄ = 15.8. [2]

Exercise 479. (9740 N2012/II/7. Answer on p. 1282.) A group of fifteen people consists of one pair of sisters, one set of three brothers and ten other people. The fifteen people are arranged randomly in a line. (i) Find the probability that the sisters are next to each other. [2] (ii) Find the probability that the brothers are not all next to each other. [2] (iii) Find the probability that the sisters are next to each other and the brothers are all next to each other. [2] (iv) Find the probability that either the sisters are next to each other or the brothers are all next to each other or both. [2] Instead the fifteen people are arranged randomly in a circle. (v) Find the probability that the sisters are next to each other. [1]

Exercise 480. (9740 N2012/II/8. Answer on p. 1283.) Amy is revising for a mathematics examination and takes a different practice paper each week. Her marks, y% in week x, are as follows.

Week x: 1 2 3 4 5 6
Percentage mark y: 38 63 67 75 71 82

(i) Draw a scatter diagram showing these marks. [1] (ii) Suggest a possible reason why one of the marks does not seem to follow the trend. [1] (iii) It is desired to predict Amy’s marks on future papers. Explain why, in this context, neither a linear nor a quadratic model is likely to be appropriate. [2]

It is decided to fit a model of the form ln(L − y) = a + bx, where L is a suitable constant. The product moment correlation coefficient between x and ln(L − y) is denoted by r. The following table gives values of r for some possible values of L.

L     r
91
92    -0.929944
93    -0.929918

(iv) Calculate the value of r for L = 91, giving your answer correct to 6 decimal places. [1]

(v) Use the table and your answer to part (iv) to suggest with a reason which of 91, 92 or 93 is the most appropriate value for L. [1] (vi) Using the value for L, calculate the values of a and b, and use them to predict the week in which Amy will obtain her first mark of at least 90%. [4] (vii) Give an interpretation, in context, of the value of L. [1]

Exercise 481. (9740 N2012/II/9. Answer on p. 1284.) In an opinion poll before an election, a sample of 30 voters is obtained. (i) The number of voters in the sample who support the Alliance Party is denoted by A. State, in context, what must be assumed for A to be well modelled by a binomial distribution. [2] Assume now that A has the distribution B(30, p).

(ii) Given that p = 0.15, find P (A = 3 or 4). [2]

You can skip part (iii) if you’re taking the 9758 (revised) exam. However, you should still do part (iv).

(iii) Given instead that p = 0.55, explain whether it is possible to approximate the distribution of A with (a) a normal distribution; (b) a Poisson distribution. [3]

(iv) For an unknown value of p it is given that P(A = 15) = 0.06864 correct to 5 decimal places. Show that p satisfies an equation of the form p(1 − p) = k, where k is a constant to be determined. Hence find the value of p to a suitable degree of accuracy, given that p < 0.5. [5]

Exercise 482. (9740 N2012/II/10. Answer on p. 1285.) You can skip this entire question if you’re taking the 9758 (revised) exam. Gold coins are found scattered throughout an archaeological site. (i) State two conditions needed for the number of gold coins found in a randomly chosen region of area 1 square metre to be well modelled by a Poisson distribution. [2] Assume that the number of gold coins in 1 square metre has the distribution Po(0.8). (ii) Find the probability that in 1 square metre there are at least 3 gold coins. [1] (iii) It is given that the probability that 1 gold coin is found in x square metres is 0.2. Write down an equation for x, and solve it numerically given that x < 1. [2]

(iv) Use a suitable approximation to find the probability that in 100 square metres there are at least 90 gold coins. State the parameter(s) of the distribution that you use. [3]

Pottery shards are also found scattered throughout the site. The number of pottery shards in 1 square metre is an independent random variable with the distribution Po(3). Use suitable approximations, whose parameters should be stated, to find (v) the probability that in 50 square metres the total number of gold coins and pottery shards is at least 200, [4] (vi) the probability that in 50 square metres there are at least 3 times as many pottery shards as gold coins. [3]

Exercise 483. (9740 N2011/II/5. Answer on p. 1286.) The continuous random variable X has the distribution N(µ, σ²). It is known that P(X < 40.0) = 0.05 and P(X < 70.0) = 0.975. Calculate the values of µ and σ. [4]

Exercise 484. (9740 N2011/II/6. Answer on p. 1286.) You can skip this entire question if you’re taking the 9758 (revised) exam. It is desired to interview residents of a city suburb about the types of shop to be opened in a new shopping mall. In particular it is necessary to interview a representative range of ages. (i) Explain how a quota sample might be carried out in this context. [2] (ii) Explain a disadvantage of quota sampling in the context of your answer to part (i). [1] (iii) State the name of a method of sampling that would not have this disadvantage, and explain whether it would be realistic to use this method in this context. [2]

Exercise 485. (9740 N2011/II/7. Answer on p. 1287.) When I try to contact (by telephone) any of my friends in the evening, I know that on average the probability that I succeed is 0.7. On one evening I attempt to contact a fixed number, n, of different friends. If I do not succeed with a particular friend, I do not attempt to contact that friend again that evening. The number of friends whom I succeed in contacting is the random variable R. (i) State, in the context of this question, two assumptions needed to model R by a binomial distribution. [2] (ii) Explain why one of the assumptions stated in part (i) may not hold in this context. [1] Assume now that these assumptions do in fact hold. (iii) Given that n = 8, find the probability that R is at least 6. [1]

You can skip part (iv) of this question if you’re taking the 9758 (revised) exam. (iv) Given that n = 40, use an appropriate approximation to find P(R < 25). State the parameters of the distribution you use. [4]

Exercise 486. (9740 N2011/II/8. Answer on p. 1288.) (i) Sketch a scatter diagram that might be expected for the case when x and y are related approximately by y = a + bx², where a is positive and b is negative. Your diagram should include 5 points, approximately equally spaced with respect to x, and with all x- and y-values positive. [1] The table gives the values of seven observations of bivariate data, x and y.

x: 2.0 2.5 3.0 3.5 4.0 4.5 5.0
y: 18.8 16.9 14.5 11.7 8.6 4.9 0.8

(ii) Calculate the value of the product moment correlation coefficient, and explain why its value does not necessarily mean that the best model for the relationship between x and y is y = c + dx. [2] (iii) Explain how to use the values obtained by calculating product moment correlation coefficients to decide, for this data, whether y = a + bx² or y = c + dx is the better model. [1]

(iv) It is desired to use the data in the table to estimate the value of y for which x = 3.2. Find the equation of the least-squares regression line of y on x². Use your equation to calculate the desired estimate. [3]
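
(A numerical check for parts (ii) to (iv), not part of the original question. This is only a sketch in standard-library Python; the helpers pmcc and least_squares_line are made up for illustration and use the usual textbook formulas. It compares the correlation of y with x against the correlation of y with x², then fits the least-squares line of y on x².)

```python
from math import sqrt


def pmcc(xs, ys):
    """Product moment correlation coefficient of the paired data (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares regression line of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


x = [2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
y = [18.8, 16.9, 14.5, 11.7, 8.6, 4.9, 0.8]
x_squared = [v**2 for v in x]

r_linear = pmcc(x, y)             # correlation between x and y
r_quadratic = pmcc(x_squared, y)  # correlation between x^2 and y

# Regression line of y on x^2, evaluated at x = 3.2 (so x^2 = 10.24).
intercept, slope = least_squares_line(x_squared, y)
estimate = intercept + slope * 3.2**2

print(round(r_linear, 4), round(r_quadratic, 4), round(estimate, 2))
```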

Exercise 487. (9740 N2011/II/9. Answer on p. 1289.) Camera lenses are made by two companies, A and B. 60% of all lenses are made by A and the remaining 40% by B. 5% of the lenses made by A are faulty. 7% of the lenses made by B are faulty. (i) One lens is selected at random. Find the probability that (a) it is faulty, [2] (b) it was made by A, given that it is faulty. [1] (ii) Two lenses are selected at random. Find the probability that (a) exactly one of them is faulty, [2] (b) both were made by A, given that exactly one is faulty. [3] (Author’s remark: Assume that there are infinitely many lenses.)

Exercise 488. (9740 N2011/II/10. Answer on p. 1289.) In a factory, the time in minutes for an employee to install an electronic component is a normally distributed continuous random variable T. The standard deviation of T is 5.0 and under ordinary conditions the expected value of T is 38.0. After background music is introduced into the factory, a sample of n components is taken and the mean time taken for randomly chosen employees to install them is found to be t̄ minutes. A test is carried out, at the 5% significance level, to determine whether the mean time taken to install a component has been reduced. (i) State appropriate hypotheses for the test, defining any symbols you use. [2] (ii) Given that n = 50, state the set of values of t̄ for which the result of the test would be to reject the null hypothesis. [3] (iii) It is given instead that t̄ = 37.1 and the result of the test is that the null hypothesis is not rejected. Obtain an inequality involving n, and hence find the set of values that n can take. [4]

Exercise 489. (9740 N2011/II/11. Answer on p. 1290.) A committee of 10 people is chosen at random from a group consisting of 18 women and 12 men. The number of women on the committee is denoted by R. (i) Find the probability that R = 4. [3]

(ii) The most probable number of women on the committee is denoted by r. By using the fact that P (R = r) > P (R = r + 1), show that r satisfies the inequality (r + 1)!(17 − r)!(9 − r)!(r + 3)! > r!(18 − r)!(10 − r)!(r + 2)!

and use this inequality to find the value of r. [5]
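
(A numerical check, not part of the original question. This is only a sketch in standard-library Python with made-up names. It computes P(R = r) exactly for every possible r, using P(R = r) = C(18, r) C(12, 10 − r) / C(30, 10), and picks out the most probable value. The question itself still requires the algebraic argument via the inequality.)

```python
from fractions import Fraction
from math import comb

WOMEN, MEN, COMMITTEE = 18, 12, 10


def prob_women(r: int) -> Fraction:
    """Exact P(R = r): choose r of the 18 women and 10 - r of the 12 men."""
    ways = comb(WOMEN, r) * comb(MEN, COMMITTEE - r)
    return Fraction(ways, comb(WOMEN + MEN, COMMITTEE))


# R can take any value from 0 to 10, since there are at least 10 men and 10 women.
probs = {r: prob_women(r) for r in range(COMMITTEE + 1)}
most_probable = max(probs, key=probs.get)

print(float(probs[4]), most_probable)
```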

Exercise 490. (9740 N2011/II/12. Answer on p. 1292.) You can skip this entire question if you’re taking the 9758 (revised) exam. The number of people joining an airport check-in queue in a period of 1 minute is a random variable with the distribution Po(1.2). (i) Find the probability that, in a period of 4 minutes, at least 8 people join the queue. [1] (ii) The probability that no more than 1 person joins the queue in a period of t seconds is 0.7. Find an equation for t. Hence find the value of t, giving your answer correct to the nearest whole number. [4] (iii) The number of people leaving the same queue in a period of 1 minute is a random variable with the distribution Po(1.8). At 0930 on a certain morning there are 35 people in the queue. Use appropriate approximations to find the probability that by 0945 there are at least 24 people in the queue, stating the parameters of any distributions that you use. (You may assume that the queue does not become empty during this period.) [5] (iv) Explain why a Poisson model would probably not be valid if applied to a time period of several hours. [1]

Exercise 491. (9740 N2010/II/5. Answer on p. 1292.) At an international athletics competition, it is desired to sample 1% of the spectators to find their opinions of the catering facilities. (i) Give a reason why it would be difficult to use a stratified sample. [1] (ii) Explain how a systematic sample could be carried out. [2]

Exercise 492. (9740 N2010/II/6. Answer on p. 1293.) The time required by an employee to complete a task is a normally distributed random variable. Over a long period it is known that the mean time required is 42.0 minutes. Background music is introduced in the workplace, and afterwards the time required, t minutes, is measured for a random sample of 11 employees. The results are summarised as follows.

n = 11,    ∑t = 454.3,    ∑t² = 18779.43.

(i) Find unbiased estimates of the population mean and variance. You can skip part (ii) of this question if you’re taking the 9758 (revised) exam. (ii) Test, at the 10% significance level, whether there has been a change in the mean time required by an employee to complete the task. [7]

Exercise 493. (9740 N2010/II/7. Answer on p. 1293.) For events A and B it is given that P (A) = 0.7, P (B) = 0.6 and P (A∣B ′ ) = 0.8. Find (i) P (A ∩ B ′ ), [2]

(ii) P (A ∪ B), [2] (iii) P (B ′ ∣A). [2]

For a third event C, it is given that P (C) = 0.5 and that A and C are independent. (iv) Find P (A′ ∩ C). [2]

(v) Hence state an inequality satisfied by P (A′ ∩ B ∩ C). [1]

(Author’s remark: The last part is a terribly vague question. A possible answer is this: “P (A′ ∩ B ∩ C) is a probability and hence by the First Kolmogorov Axiom, P (A′ ∩ B ∩ C) ≥ −500.” But this is probably not the answer wanted by the writers of this question. In my answer, I’ll simply give the best guess of what I think the writers wanted.)

Exercise 494. (9740 N2010/II/8. Answer on p. 1294.) The digits 1, 2, 3, 4 and 5 are arranged randomly to form a five-digit number. No digit is repeated. Find the probability that (i) the number is greater than 30000, [1] (ii) the last two digits are both even, [2] (iii) the number is greater than 30000 and odd. [4]

Exercise 495. (9740 N2010/II/9. Answer on p. 1294.) In this question you should state clearly the values of the parameters of any normal distribution you use. Over a three-month period Ken makes X minutes of peak-rate telephone calls and Y minutes of cheap-rate calls. X and Y are independent random variables with the distributions N(180, 30²) and N(400, 60²) respectively.

(i) Find the probability that, over a three-month period, the number of minutes of cheap-rate calls made by Ken is more than twice the number of minutes of peak-rate calls. [4] Peak-rate calls cost $0.12 per minute and cheap-rate calls cost $0.05 per minute. (ii) Find the probability that, over a three-month period, the total cost of Ken’s calls is greater than $45. [3] (iii) Find the probability that the total cost of Ken’s peak-rate calls over two independent three-month periods is greater than $45. [3]

Exercise 496. (9740 N2010/II/10. Answer on p. 1295.) A car is placed in a wind tunnel and the drag force F for different wind speeds v, in appropriate units, is recorded. The results are shown in the table.

v: 0 4 8 12 16 20 24 28 32 36
F: 0 2.5 5.1 8.8 11.2 13.6 17.6 22.0 27.8 33.9

(i) Draw the scatter diagram for these values, labelling the axes clearly. [2] It is thought that the drag force F can be modelled by one of the formulae

F = a + bv    or    F = c + dv²,

where a, b, c and d are constants.

(ii) Find, correct to 4 decimal places, the value of the product moment correlation coefficient between (a) v and F , (b) v 2 and F . [2] (iii) Use your answers to parts (i) and (ii) to explain which of F = a + bv or F = c + dv 2 is the better model. [1]

(iv) It is required to estimate the value of v for which F = 26.0. Find the equation of a suitable regression line, and use it to find the required estimate. Explain why neither the model F = a + bv nor the model F = c + dv² should be used.⁸⁶ [4]

⁸⁶ I have changed the wording of this sentence slightly.

Exercise 497. (9740 N2010/II/11. Answer on p. 1296.) You can skip this entire question if you’re taking the 9758 (revised) exam. In this question you should state clearly all distributions that you use, together with the values of the appropriate parameters. The number of telephone calls received by a call centre in one minute is a random variable with distribution Po(3).

(i) Find the probability that exactly 8 calls are received in a randomly chosen period of 4 minutes. [2]

(ii) Find the length of time, to the nearest second, for which the probability that no calls are received is 0.2. [3]

(iii) Use a suitable approximation to find the probability that, on a randomly chosen working day of 12 hours, more than 2200 calls are received. [4]

A working day of 12 hours on which more than 2200 calls are received is said to be ‘busy’.

(iv) Find the probability that, in six randomly chosen working days, exactly two are busy. [2]

(v) Use a suitable approximation to find the probability that, in 30 randomly chosen working days of 12 hours, fewer than 10 are busy. [4]


Exercise 498. (9740 N2009/II/5. Answer on p. 1296.) You can skip this entire question if you’re taking the 9758 (revised) exam. A cinema manager wishes to take a survey of opinions of cinema-goers. Describe how a quota sample of size 100 might be obtained, and state one disadvantage of quota sampling. [3]

Exercise 499. (9740 N2009/II/6. Answer on p. 1297.) The table gives the world record time, in seconds above 3 minutes 30 seconds, for running 1 mile as at 1st January in various years.

Year, x   1930   1940   1950   1960   1970   1980   1990   2000
Time, t   40.4   36.4   31.3   24.5   21.1   19.0   16.3   13.1

(i) Draw a scatter diagram to illustrate the data. [2]

(ii) Comment on whether a linear model would be appropriate, referring both to the scatter diagram and the context of the question. [2]

(iii) Explain why in this context a quadratic model would probably not be appropriate for long-term predictions. [1]

(iv) Fit a model of the form ln t = a + bx to the data and use it to predict the world record time as at 1st January 2010. Comment on the reliability of your prediction. [3]

Exercise 500. (9740 N2009/II/7. Answer on p. 1298.) A company buys p% of its electronic components from supplier A and the remaining (100 − p)% from supplier B. The probability that a randomly chosen component supplied by A is faulty is 0.05. The probability that a randomly chosen component supplied by B is faulty is 0.03.

(i) Given that p = 25, find the probability that a randomly chosen component is faulty. [2]

(ii) For a general value of p, the probability that a randomly chosen component that is faulty was supplied by A is denoted by f(p). Show that f(p) = 0.05p / (0.02p + 3). Prove by differentiation that f is an increasing function for 0 ≤ p ≤ 100, and explain what this statement means in the context of the question. [6]

Exercise 501. (9740 N2009/II/8. Answer on p. 1298.) Find the number of ways in which the letters of the word ELEVATED can be arranged if

(i) there are no restrictions, [1]

(ii) T and D must not be next to one another, [2]

(iii) consonants (L, V, T, D) and vowels (E, A) must alternate, [3]

(iv) between any two Es there must be at least 2 other letters. [3]

Exercise 502. (9740 N2009/II/9. Answer on p. 1299.) The thickness in cm of a mechanics textbook is a random variable with the distribution N(2.5, 0.1²).

(i) The mean thickness of n randomly chosen mechanics textbooks is denoted by M̄ cm. Given that P(M̄ > 2.53) = 0.0668, find the value of n. [3]

The thickness in cm of a statistics textbook is a random variable with the distribution N(2.0, 0.08²).

(ii) Calculate the probability that 21 mechanics textbooks and 24 statistics textbooks will fit into a bookshelf of length 1 m. State clearly the mean and variance of any normal distribution you use in your calculation.

(iii) Calculate the probability that the total thickness of 4 statistics textbooks is less than three times the thickness of 1 mechanics textbook. State clearly the mean and variance of any normal distribution you use in your calculation. [3]

(iv) State an assumption needed for your calculation in parts (ii) and (iii). [1]

Exercise 503. (9740 N2009/II/10. Answer on p. 1300.) A company supplies sugar in small packets. The mass of sugar in one packet is denoted by X grams. The masses of a random sample of 9 packets are summarised by

∑ x = 86.4, ∑ x² = 835.92.

(i) Calculate unbiased estimates of the mean and variance of X. [2] You can skip the rest of this question if you’re taking the 9758 (revised) exam. The mean mass of sugar in a packet is claimed to be 10 grams. The company directors want to know whether the sample indicates that this claim is incorrect. (ii) Stating a necessary assumption, carry out a t-test at the 5% significance level. Explain why the Central Limit Theorem does not apply in this context. [7] (iii) Suppose now that the population variance of X is known, and that the assumption made in part (ii) is still valid. What change would there be in carrying out the test? [1]


Exercise 504. (9740 N2009/II/11. Answer on p. 1301.) A fixed number, n, of cars is observed and the number of those cars that are red is denoted by R. (i) State, in context, two assumptions needed for R to be well modelled by a binomial distribution. [2] Assume now that R has the distribution B(n, p). [2] (ii) Given that n = 20 and p = 0.15, find P (4 ≤ R < 8). [2]

You can skip parts (iii) and (iv) of this question if you’re taking the 9758 (revised) exam. But you should still do part (v). (iii) Given that n = 240 and p = 0.3, find P (R < 60) using a suitable approximation, which should be clearly stated.

(iv) Given that n = 240 and p = 0.02, find P (R = 3) using a suitable approximation, giving your answer correct to 4 decimal places and explaining why the approximation is appropriate in this case. [3] (v) Given that n = 20 and P (R = 0 or 1) = 0.2, write down an equation for the value of p, and find this value numerically. [2] Exercise 505. (9740 N2008/II/5. Answer on p. 1302.) You can skip this entire question if you’re taking the 9758 (revised) exam. A school has 950 pupils. (i) A sample of 50 pupils is to be chosen to take part in a survey. Describe how the sample could be chosen using systematic sampling. [2] The purpose of the survey is to investigate pupils’ opinions about the sports facilities available at the school. (ii) Give a reason why a stratified sample might be preferable in this context. [2]

Exercise 506. (9740 N2008/II/6. Answer on p. 1302.) You can skip this entire question if you’re taking the 9758 (revised) exam. In mineral water from a certain source, the mass of calcium, X mg, in a one-litre bottle is a normally distributed random variable with mean µ. Based on observations over a long period, it is known that µ = 78. Following a period of extreme weather, 15 randomly chosen bottles of the water were analysed. The masses of calcium in the bottles are summarised by ∑ x = 1026.0,

∑ x² = 77265.90.

Test, at the 5% significance level, whether the mean mass of calcium in a bottle has changed. [6]


Exercise 507. (9740 N2008/II/7. Answer on p. 1302.) A computer game simulates a tennis match between two players, A and B. The match consists of at most three sets. Each set is won by either A or B, and the match is won by the first player to win two sets. The simulation uses the following rules.

• The probability that A wins the first set is 0.6.
• For each set after the first, the conditional probability that A wins that set, given that A won the preceding set, is 0.7.
• For each set after the first, the conditional probability that B wins that set, given that B won the preceding set, is 0.8.

Calculate the probability that

(i) A wins the second set, [2]

(ii) A wins the match, [3]

(iii) B won the first set, given that A wins the match. [3]

Exercise 508. (9740 N2008/II/8. Answer on p. 1303.) A certain metal discolours when exposed to air. To protect the metal against discolouring, it is treated with a chemical. In an experiment, different quantities, x ml, of the chemical were applied to standard samples of the metal, and the times, t hours, for the metal to discolour were measured. The results are given in the table.

x   1.2   2.0   2.7   3.8   4.8   5.6   6.9
t   2.2   4.5   5.8   7.3   7.6   9.0   9.9

(i) Calculate the product moment correlation coefficient between x and t, and explain whether your answer suggests that a linear model is appropriate. [3]

(ii) Draw a scatter diagram for the data. [1]

One of the values of t appears to be incorrect.

(iii) Indicate the corresponding point on your diagram by labelling it P, and explain why the scatter diagram for the remaining points may be consistent with a model of the form t = a + b ln x. [2]

(iv) Omitting P, calculate least square estimates of a and b for the model t = a + b ln x. [2]

(v) Estimate the value of t at the value of x corresponding to P. [1]

(vi) Comment on the use of the model in part (iv) in predicting the value of t when x = 8.0. [1]


Exercise 509. (9740 N2008/II/9. Answer on p. 1304.) You can skip this entire question if you’re taking the 9758 (revised) exam. A shop sells two types of piano, ‘grand’ and ‘upright’. The mean number of grand pianos sold in a week is 1.8.

(i) Use a Poisson distribution to find the probability that in a given week at least 4 grand pianos are sold. [2]

The mean number of upright pianos sold in a week is 2.6. The sales of the two types of piano are independent.

(ii) Use a Poisson distribution to find the probability that in a given week the total number of pianos sold is exactly 4. [2]

(iii) Use a normal approximation to the Poisson distribution to find the probability that the number of grand pianos sold in a year of 50 weeks is less than 80. [4]

(iv) Explain why the Poisson distribution may not be a good model for the number of grand pianos sold in a year. [2]

Exercise 510. (9740 N2008/II/10. Answer on p. 1305.) A group of diplomats is to be chosen to represent three islands, K, L and M . The group is to consist of 8 diplomats and is chosen from a set of 12 diplomats consisting of 3 from K, 4 from L and 5 from M . Find the number of ways in which the group can be chosen if it includes (i) 2 diplomats from K, 3 from L and 3 from M , [2] (ii) diplomats from L and M only, [2] (iii) at least 4 diplomats from M , [2] (iv) at least 1 diplomat from each island. [4]

Exercise 511. (9740 N2008/II/11. Answer on p. 1306.) The random variable X has the distribution N(50, 8²). Given that X₁ and X₂ are two independent observations of X, find (i) P(X₁ + X₂ > 120), [2]

(ii) P(X₁ > X₂ + 15). [3]

The random variable Y is related to X by the formula Y = aX + b, where a and b are constants with a > 0.

(iii) Given that P (Y < 74) = P (Y > 146) = 0.0668, find the values of E(Y ) and Var(Y ), and hence find the values of a and b. [7]


Exercise 512. (9233 N2008/I/1. Answer on p. 1306.) On a bookshelf there are 15 different books; 6 have red covers, 5 have blue covers and 4 have green covers. All the red books are to be kept together, all the blue books are to be kept together and all the green books are to be kept together. In how many ways can the 15 books be arranged on the bookshelf? [3]

Exercise 513. (9233 N2008/II/23. Answer on p. 1306.) The events A, B and C are such that P(A) = 0.2, P(C) = 0.4, P(A ∪ B) = 0.4 and P(B ∩ C) = 0.1. Given that A and B are independent, find P(B) and show that B and C are also independent. [4]

Exercise 514. (9233 N2008/II/26. Answer on p. 1307.) You can skip this entire question if you’re taking the 9758 (revised) exam. The number of times that an office photocopying machine breaks down in a week follows a Poisson distribution with mean 3. Find the probability that

(i) the machine will break down more than twice in a given week, [2]

(ii) the machine will break down at most three times in a period of four weeks. [3]

(iii) Use a suitable approximation to find the probability that the machine will break down more than 50 times in a period of 16 weeks. [4]

Exercise 515. (9233 N2008/II/27. Answer on p. 1307.) The masses of a certain type of electronic component produced by a machine are normally distributed with mean 32.40 g. The machine is adjusted and a sample of 80 components is now taken and is found to have a mean mass 32.00 g. The unbiased estimate of the population variance, calculated from this sample, is 2.892 g².

(i) Test at the 5% significance level whether this indicates a change in the mean. [5]

(ii) Explain what you understand by the phrase ‘at the 5% significance’ in the context of this question. [2]

(iii) Find the least level of significance at which this sample would indicate a decrease in the population mean. [3]


Exercise 516. (9233 N2008/II/29. Answer on p. 1308.) Mr Sim and Mr Lee work in the same office and are expected to arrive by 9 a.m. each day. Both men drive to work. (i) The time taken for Mr Sim’s journey follows a normal distribution with mean 50 minutes and standard deviation 4 minutes. Given that he regularly leaves home at 8.05 a.m., find the probability that he will be late no more than once in a working week of 5 days. [5] (ii) Mr Lee’s journey time follows a normal distribution with mean 40 minutes and standard deviation 5 minutes. Mr Lee leaves home at 8.10 a.m. each day. Find the probability that Mr Sim will arrive at work before Mr Lee on any particular day. [5] (iii) Find the probability that in a working week of 5 days, Mr Sim arrives at work before Mr Lee on at least 3 days. [2] Exercise 517. (9233 N2008/II/30. Answer on p. 1309.) (i) The masses of valves produced by a machine are normally distributed with mean µ and standard deviation σ. 12% of the valves have mass less than 86.50 g and 20% have mass more than 92.25 g. Find µ and σ. [4] (ii) The setting of the machine is adjusted so that the mean mass of the valves produced is unchanged, but the standard deviation is reduced. Given that 80% of the valves now have a mass within 2 g of the mean, find the new standard deviation. [3] (iii) After the machine has been adjusted, a random sample of n valves is taken. Find the smallest value of n such that the probability that the sample mean exceeds µ by at least 0.50 g is at most 0.1. [5] Exercise 518. (9740 N2007/II/5. Answer on p. 1309.) You can skip this entire question if you’re taking the 9758 (revised) exam. (i) Give a real-life example of a situation in which quota sampling could be used. Explain why quota sampling would be appropriate in this situation, and describe briefly any disadvantage that quota sampling has. [4] (ii) Explain briefly whether it would be possible to use stratified sampling in the situation you have described in part (i). 
[1]

Exercise 519. (9740 N2007/II/6. Answer on p. 1310.) In a large population, 24% have a particular gene A, and 0.3% have gene B.

(i) Find the probability that, in a random sample of 10 people from the population, at most 4 have gene A. [2]

You can skip the rest of this question if you’re taking the 9758 (revised) exam. A random sample of 1000 people is taken from the population. Using appropriate approximations, find

(ii) the probability that between 230 and 260 inclusive have gene A, [3]

(iii) the probability that at least 2 but fewer than 5 have gene B. [2]

Exercise 520. (9740 N2007/II/7. Answer on p. 1310.) A large number of students in a college have completed a geography project. The time, x hours, taken by a student to complete the project is noted for a random sample of 150 students. The results are summarised by ∑ x = 4626,

∑ x² = 147691.

(i) Find unbiased estimates of the population mean and variance. [2]

(ii) Test, at the 5% significance level, whether the population mean time for a student to complete the project exceeds 30 hours. [4]

You can skip part (iii) of this question if you’re taking the 9758 (revised) exam.

(iii) State, giving a valid reason, whether any assumptions about the population are needed in order for the test to be valid. [1]

Exercise 521. (9740 N2007/II/8. Answer on p. 1311.) Chickens and turkeys are sold by weight. The masses, in kg, of chickens and turkeys are modelled as having independent normal distributions with means and standard deviations as shown in the table.

           Mean Mass   Standard Deviation
Chickens   2.2         0.5
Turkeys    10.5        2.1

Chickens are sold at $3 per kg and turkeys at $5 per kg.

(i) Find the probability that a randomly chosen chicken has a selling price exceeding $7. [2]

(ii) Find the probability of the event that both a randomly chosen chicken has a selling price exceeding $7 and a randomly chosen turkey has a selling price exceeding $55. [3]

(iii) Find the probability that the total selling price of a randomly chosen chicken and a randomly chosen turkey is more than $62. [4]

(iv) Explain why the answer to part (iii) is greater than the answer to part (ii). [1]


Exercise 522. (9740 N2007/II/9. Answer on p. 1311.) A group of 12 people consists of 6 married couples. (i) The group stand in a line. (a) Find the number of different possible orders. [1] (b) Find the number of different possible orders in which each man stands next to his wife. [3] (ii) The group stand in a circle. (a) Find the number of different possible arrangements. [1] (b) Find the number of different possible arrangements if men and women alternate. [2] (c) Find the number of different possible arrangements if each man stands next to his wife and men and women alternate. [2]

Exercise 523. (9740 N2007/II/10. Answer on p. 1312.) A player throws three darts at a target. The probability that he is successful in hitting the target with his first throw is 1/8. For each of his second and third throws, the probability of success is

• twice the probability of success on the preceding throw if that throw was successful,
• the same as the probability of success on the preceding throw if that throw was unsuccessful.

Construct a probability tree showing this information. [3]

Find

(i) the probability that all three throws are successful, [2]

(ii) the probability that at least two throws are successful, [2]

(iii) the probability that the third throw is successful given that exactly two of the three throws are successful. [4]


Exercise 524. (9740 N2007/II/11. Answer on p. 1313.) Research is being carried out into how the concentration of a drug in the bloodstream varies with time, measured from when the drug is given. Observations at successive times give the data shown in the following table.

Time (t minutes)                          15   30   60   90   120   150   180   240   300
Concentration (x micrograms per litre)    82   65   43   37   22    19    12    6     2

It is given that the value of the product moment correlation coefficient for this data is −0.912, correct to 3 decimal places. The scatter diagram for the data is shown below.

[Scatter diagram: x (micrograms per litre) on the vertical axis, from 0 to 100, against t (minutes) on the horizontal axis, from 0 to 300.]

(i) Calculate the equation of the regression line of x on t. [2] (ii) Calculate the corresponding estimated value of x when t = 300, and comment on the suitability of the linear model. [2] The variable y is defined by y = ln x. For the variables y and t,

(iii) calculate the product moment correlation coefficient and comment on its value, [2] (iv) calculate the equation of the appropriate regression line. [3] (v) Use a regression line to give the best estimate that you can of the time when the drug concentration is 15 micrograms per litre. [2]

Exercise 525. (9233 N2007/I/4. Answer on p. 1313.) The diagram shows two straight lines, ABCD and AEFGHIJ, which intersect at A. Triangles are to be drawn using three of the points A, B, C, D, E, F, G, H, I, J as vertices.

(i) How many different triangles can be drawn which have the point A as one of the vertices? [1] (ii) How many different triangles in total can be drawn? [4] Exercise 526. (9233 N2007/II/23. Answer on p. 1314.) (i) A random sample of size 100 is taken from a population with mean 30 and standard deviation 5. Find an approximate value for the probability that the sample mean lies between 29.2 and 30.8. [6] (ii) Giving a reason, state whether it is necessary to make any assumptions about the distribution of the population. [1]


Exercise 527. (9233 N2007/II/25. Answer on p. 1314.) The numbers of men and women studying Chemistry, Physics and Biology at a college are given in the following table.

         Chemistry   Physics   Biology
Men      12          16        32
Women    8           12        20

One of these students is chosen at random by a researcher. Events M, W, C and B are defined as follows.

• M: the student chosen is a man.
• W: the student chosen is a woman.
• C: the student chosen is studying Chemistry.
• B: the student chosen is studying Biology.

Find

(i) P(W ∣ B). [1]

(ii) P(B ∣ W). [1]

(iii) P(B ∪ W). [2]

State, with a reason in each case, whether W and B are independent, and whether M and C are mutually exclusive. [4]

Exercise 528. (9233 N2007/II/26. Answer on p. 1314.) You can skip this entire question if you’re taking the 9758 (revised) exam. At a fire station, each call-out is classified as either genuine or false. Call-outs occur at random times. On average, there are two genuine call-outs in a week, and one false call-out in a two-week period.

(i) Calculate the probability that there are fewer than 6 genuine call-outs in a randomly chosen two-week period.

(ii) Using a suitable approximation, calculate the probability that the total number of call-outs in a randomly chosen six-week period exceeds 19.


Exercise 529. (9233 N2007/II/27. Answer on p. 1315.) An oil mixture is produced by mixing L litres of light oil with H litres of heavy oil. The random variables L and H are independent normal variables. The expected value of L is 5 and its standard deviation is 0.1. The expected value of H is 3 and its standard deviation is 0.05.

(i) Find the probability that the volume of the mixture lies between 7.9 litres and 8.2 litres. [6]

The density of light oil is 0.74 kilograms per litre, and the density of heavy oil is 0.86 kilograms per litre.

(ii) Find the probability that the mass of the mixture lies between 6.1 kg and 6.2 kg. [6]

[Density is defined by Density = Mass / Volume.]

Exercise 530. (9233 N2006/I/4. Answer on p. 1315.) A box contains 8 balls, of which 3 are identical (and so are indistinguishable from one another) and the other 5 are different from each other. 3 balls are to be picked out of the box; the order in which they are picked out does not matter. Find the number of different possible selections of 3 balls. [4] (Author’s remark: Assume also that the latter 5 balls are each different from the first 3 balls.)

Exercise 531. (9233 N2006/II/23. Answer on p. 1316.) Two fair dice, one red and the other green, are thrown.

• A is the event: The score on the red die is divisible by 3.
• B is the event: The sum of two scores is 9.

(i) Justifying your conclusion, determine whether A and B are independent. [3]

(ii) Find P(A ∪ B). [2]

Exercise 532. (9233 N2006/II/25. Answer on p. 1316.) The mass of vegetables in a randomly chosen bag has a normal distribution. The mass of the contents of a bag is supposed to be 10 kg. A random sample of 80 bags is taken and the mass of the contents of each bag, x grams, is measured. The data are summarised by ∑(x − 10000) = −2510,

∑(x − 10000)² = 2010203.

(i) Test, at the 5% significance level, whether the mean mass of the contents of a bag is less than 10 kg. [7] (ii) Explain, in the context of the question, the meaning of ‘at the 5% significance level’. [1]

Exercise 533. (9233 N2006/II/26. Answer on p. 1315.) You can skip this entire question if you’re taking the 9758 (revised) exam. In a weather model, severe floods are assumed to occur at random intervals, but at an average rate of 2 per 100 years. (i) Using this model, find the probability that, in a randomly chosen 200-year period, there is exactly one severe flood in the first 100 years and exactly one severe flood in the second 100 years. [3] (ii) Using the same model, and a suitable approximation, find the probability that there are more than 25 severe floods in 1000 years. [5]

Exercise 534. (9233 N2006/II/28. Answer on p. 1317.) Observations are made of the speeds of cars on a particular stretch of road during daylight hours. It is found that, on average, 1 in 80 cars is travelling at a speed exceeding 125 km h⁻¹, and 1 in 10 is travelling at a speed less than 40 km h⁻¹.

(i) Assuming a normal distribution, find the mean and the standard deviation of this distribution. [4]

(ii) A random sample of 10 cars is to be taken. Find the probability that at least 7 will be travelling at a speed in excess of 40 km h⁻¹. [3]

You can skip part (iii) of this question if you’re taking the 9758 (revised) exam.

(iii) A random sample of 100 cars is to be taken. Using a suitable approximation, find the probability that at most 8 cars will be travelling at a speed less than 40 km h⁻¹. [3]


Part VIII

Appendices (Optional)

The discussion in the main text above has not always been complete, precise, and rigorous. In these appendices, I fill in these gaps. In particular, I give formal definitions, statements of claims, and proofs of claims. In general, where there is a trade-off between generality of a result and the simplicity of its proof, I favour the latter.


80 Appendices for Part I: Functions and Graphs

80.1 Sets

Fact 1. Two sets are subsets of each other ⇐⇒ they are identical.

Proof. (1) If every element in A is also in B and every element in B is also in A, then both sets contain exactly the same elements. By Definition 3 then, A = B. (2) If A = B, then both sets contain exactly the same elements. Hence, every element in A is also in B and every element in B is also in A.


80.2 Functions

Somewhat strangely, a function is formally defined to be a set!

Definition 137. A function f ∶ D → C is any f ⊆ D × C where for each x ∈ D, there is a unique y ∈ C such that (x, y) ∈ f.

(D × C is an example of a cartesian product. D × C is the set of ordered pairs (d, c) such that d ∈ D and c ∈ C. For example, say D = Z+ and C = Z− . Then the set D × C contains elements such as (14, −1) and (3, −5), but not (−1, −1) or (3, 3).) When a function is given the above definition, there is no formal distinction between a function and its graph. A function is what we called its graph — that is, a function is a set of points.
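Definition 137 can be tried out on small finite sets. The following is only an illustrative sketch: the helper name `is_function` and the example sets D, C, f and g are assumptions made here, not part of the text.

```python
from itertools import product

def is_function(f, D, C):
    """Return True if the set of pairs f is a function from D to C (Definition 137)."""
    # f must be a subset of the cartesian product D x C.
    if not all(x in D and y in C for (x, y) in f):
        return False
    # Each x in D must appear as a first coordinate in exactly one pair.
    return all(sum(1 for (u, _) in f if u == x) == 1 for x in D)

D = {1, 2, 3}
C = {-1, -2}
cartesian = set(product(D, C))      # D x C: every ordered pair (d, c)
f = {(1, -1), (2, -1), (3, -2)}     # each x in D is paired with exactly one y
g = {(1, -1), (1, -2), (2, -1)}     # 1 is paired twice and 3 not at all

print(len(cartesian))               # 6
print(is_function(f, D, C))         # True
print(is_function(g, D, C))         # False
```

Here f is literally a set of points, which is the sense in which a function and its graph coincide.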

If it seems strange to you that a function is defined to be a set, you might find it stranger still that every mathematical object can be defined in terms of sets. For example, in standard set theory, very strangely, the number 0 is defined to be the empty set {} = ∅. The number 1 is defined to be {0} = {{}} = {∅}. The number 2 is defined to be {0, 1} = {{} , {{}}} = {∅, {∅}}. The number 3 is defined to be {0, 1, 2} = {{} , {{}} , {{} , {{}}}} = {∅, {∅} , {∅, {∅}}}. Etc.
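The pattern in these definitions can be generated mechanically. The sketch below assumes the usual successor rule n + 1 = n ∪ {n}, which the text does not state explicitly but which reproduces the sets listed above; the function name `von_neumann` is chosen here for illustration.

```python
def von_neumann(n):
    """Return the set that standard set theory uses to define the natural number n."""
    number = frozenset()                 # 0 is the empty set
    for _ in range(n):
        number = number | {number}       # assumed successor rule: n + 1 = n ∪ {n}
    return number

zero, one, two, three = (von_neumann(k) for k in range(4))
print(one == frozenset({zero}))              # True
print(two == frozenset({zero, one}))         # True
print(three == frozenset({zero, one, two}))  # True
print(len(von_neumann(5)))                   # 5
```

A `frozenset` is used because Python sets can only contain hashable (immutable) elements.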

Sets are what mathematicians call a primitive notion. That is, sets are left undefined (though they do have to satisfy certain axioms). But having summoned out of the void this single undefined object called the set, mathematicians can then define every other mathematical object based on the set. It is in this sense that the set is the basic building block out of which every other mathematical object can be built. The idea is to have just one undefined object, then define everything else based on this single undefined object.


80.3 Reflection in a Line

Fact 5 (reproduced from p. 92). Let (a, b) be a point. Its reflection in the line y = x is the point (b, a).

Proof. Let (p, q) be the reflection of the point (a, b) in the line y = x.

Consider the line through the points (a, b) and (p, q). It is perpendicular to the line y = x, whose slope is 1. And so the slope of this line must be −1. Thus, q − b = −1 × (p − a), or

$$q - b = a - p. \tag{1}$$

Now consider the midpoint of the line segment connecting (a, b) and (p, q), namely $\left(\frac{a+p}{2}, \frac{b+q}{2}\right)$. This point must be on the line y = x. And so $\frac{b+q}{2} = \frac{a+p}{2}$, or

$$b + q = a + p. \tag{2}$$

Taking (1) plus (2), we have 2q = 2a or q = a. Taking (2) minus (1), we have 2p = 2b or p = b. Altogether, (p, q) = (b, a), as desired.

Similarly,

Fact 6 (reproduced from p. 92). Let (a, b) be a point. Its reflection in the line y = −x is the point (−b, −a).

Proof. Let (p, q) be the reflection of the point (a, b) in the line y = −x.

Consider the line through the points (a, b) and (p, q). It is perpendicular to the line y = −x, whose slope is −1. And so the slope of this line must be 1. Thus, q − b = 1 × (p − a), or

$$q - b = p - a. \tag{1}$$

Now consider the midpoint of the line segment connecting (a, b) and (p, q), namely $\left(\frac{a+p}{2}, \frac{b+q}{2}\right)$. This point must be on the line y = −x. And so $\frac{b+q}{2} = -\frac{a+p}{2}$, or

$$b + q = -a - p. \tag{2}$$

Taking (1) plus (2), we have 2q = −2a or q = −a. Taking (2) minus (1), we have −2p = 2b or p = −b. Altogether, (p, q) = (−b, −a), as desired.

More generally,

Fact 86. Let (p, q) be a point. Its reflection in the line ay + bx + c = 0 is the point

$$\left(\frac{p(a^2 - b^2) - 2b(aq + c)}{a^2 + b^2},\ \frac{q(b^2 - a^2) - 2a(bp + c)}{a^2 + b^2}\right).$$

Proof. Consider the line that is perpendicular to the line of reflection and which contains (p, q). It can be written as −by + ax + d = 0, where d is an unknown. Since (p, q) is on this line, we have −bq + ap + d = 0, so that d = bq − ap. So the perpendicular line is −by + ax + bq − ap = 0.

The intersection of the line of reflection and the perpendicular line we just found is given by the system of equations:

$$ay + bx + c = 0 \quad \text{(equation of line of reflection)}, \tag{1}$$

$$-by + ax + bq - ap = 0 \quad \text{(equation of perpendicular line)}. \tag{2}$$

Take b × (1) plus a × (2) and do the algebra to get

$$x = \frac{a^2 p - b(aq + c)}{a^2 + b^2}.$$

Similarly, take a × (1) minus b × (2) and do the algebra to get

$$y = \frac{b^2 q - a(bp + c)}{a^2 + b^2}.$$

(x, y) is the midpoint between (p, q) and the reflection point we are looking for. Thus, our reflection point has x- and y-coordinates

$$2x - p = 2\,\frac{a^2 p - b(aq + c)}{a^2 + b^2} - p = \frac{p(a^2 - b^2) - 2b(aq + c)}{a^2 + b^2},$$

$$2y - q = 2\,\frac{b^2 q - a(bp + c)}{a^2 + b^2} - q = \frac{q(b^2 - a^2) - 2a(bp + c)}{a^2 + b^2}.$$
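As a sanity check, the formula in Fact 86 should reduce to Facts 5 and 6 for the lines y = x and y = −x. The sketch below uses exact rational arithmetic; the function name `reflect` and the test points are assumptions chosen for illustration.

```python
from fractions import Fraction

def reflect(p, q, a, b, c):
    """Reflect (p, q) in the line a*y + b*x + c = 0 using the formula in Fact 86."""
    p, q, a, b, c = map(Fraction, (p, q, a, b, c))
    denom = a * a + b * b
    x = (p * (a * a - b * b) - 2 * b * (a * q + c)) / denom
    y = (q * (b * b - a * a) - 2 * a * (b * p + c)) / denom
    return x, y

# The line y = x is 1*y + (-1)*x + 0 = 0, so (a, b, c) = (1, -1, 0).
assert reflect(3, 7, 1, -1, 0) == (7, 3)      # agrees with Fact 5
# The line y = -x is 1*y + 1*x + 0 = 0, so (a, b, c) = (1, 1, 0).
assert reflect(3, 7, 1, 1, 0) == (-7, -3)     # agrees with Fact 6
# Reflecting twice in the same line returns the original point.
assert reflect(*reflect(1, 2, 1, 2, 3), 1, 2, 3) == (1, 2)
```

Using `Fraction` avoids floating-point rounding, so the equalities are checked exactly.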

Fact 7 (reproduced from p. 94). Let f be an invertible function. Then the reflection of the graph of f in the line y = x is the graph of its inverse function f⁻¹.

Proof. By definition of the inverse function, f(a) = b ⇐⇒ f⁻¹(b) = a. Hence, (a, b) is in the graph of f ⇐⇒ (b, a) is in the graph of f⁻¹. But as we showed in Fact 5, the reflection of the point (a, b) in the line y = x is the point (b, a). And so the reflection of the graph of f in the line y = x is precisely the graph of f⁻¹.

Page 851, Table of Contents

www.EconsPhDTutor.com

80.4 The Hyperbola y = (bx + c)/(dx + e)

Fact 87. The graph of $y = \dfrac{bx + c}{dx + e}$ has no turning points.

Proof. Compute

$$\begin{aligned}
\frac{dy}{dx} &= \frac{d}{dx}\left[\frac{b}{d} + \frac{cd - be}{d^2}\,\frac{1}{x + e/d}\right] \\
&= \frac{d}{dx}\,\frac{b}{d} + \frac{d}{dx}\left[\frac{cd - be}{d^2}\,\frac{1}{x + e/d}\right] \\
&= 0 + \frac{cd - be}{d^2}\,\frac{d}{dx}\left[\frac{1}{x + e/d}\right] \\
&= \frac{cd - be}{d^2}\left[(-1)\frac{1}{(x + e/d)^2}\right].
\end{aligned}$$

Hence, dy/dx = 0 ⇐⇒ cd − be = 0. But if cd − be = 0, then y = b/d is a constant (i.e. its graph is just a horizontal line). Altogether then, there can be no turning points.

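80.5 The Hyperbola y = (ax² + bx + c)/(dx + e)

First, do the long division, one step at a time:

$$ax^2 + bx + c = \frac{a}{d}x\,(dx + e) + \left(b - \frac{ae}{d}\right)x + c,$$

$$\left(b - \frac{ae}{d}\right)x + c = \left(\frac{b}{d} - \frac{ae}{d^2}\right)(dx + e) + c + \left(\frac{ae}{d^2} - \frac{b}{d}\right)e.$$

The “quotient” is $\dfrac{a}{d}x + \dfrac{b}{d} - \dfrac{ae}{d^2}$ and the “remainder” is $c + \left(\dfrac{ae}{d^2} - \dfrac{b}{d}\right)e$. Let’s see if we can simplify this so that x in the denominator has no coefficient:

$$\begin{aligned}
\frac{ax^2 + bx + c}{dx + e} &= \frac{a}{d}x + \frac{b}{d} - \frac{ae}{d^2} + \frac{c + \left(\frac{ae}{d^2} - \frac{b}{d}\right)e}{dx + e} \\
&= \frac{a}{d}x + \frac{b}{d} - \frac{ae}{d^2} + \frac{c + \left(\frac{ae}{d^2} - \frac{b}{d}\right)e}{d}\,\frac{1}{x + e/d} \\
&= \frac{a}{d}x + \frac{bd - ae}{d^2} + \frac{c + \left(\frac{ae}{d^2} - \frac{b}{d}\right)e}{d}\,\frac{1}{x + e/d} \\
&= \frac{a}{d}x + \frac{bd - ae}{d^2} + \frac{d^2 c + (ae - bd)e}{d^3}\,\frac{1}{x + e/d}.
\end{aligned}$$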

Recall that to rule out trivial cases, we assumed that d ≠ 0; c and e are not both 0; and a ≠ 0.

Now in addition, we’ll also assume that $d^2 c + (ae - bd)e \neq 0$ (otherwise the function is a linear function). We now examine the hyperbola’s intercepts, turning points, asymptotes, centre, and lines of symmetry.

1. Intercepts. If e = 0, then the graph does not intersect the vertical axis. If e ≠ 0, then the graph intersects the vertical axis at the point $\left(0, \dfrac{c}{e}\right)$.

The horizontal intercepts are given by the roots of the equation ax² + bx + c = 0. So if b² − 4ac < 0, then there are no horizontal intercepts. Otherwise the two horizontal intercepts are $\dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}$ (identical if b² − 4ac = 0). And so the graph intersects the horizontal axis at the points $\left(\dfrac{-b - \sqrt{b^2 - 4ac}}{2a}, 0\right)$ and $\left(\dfrac{-b + \sqrt{b^2 - 4ac}}{2a}, 0\right)$.

2. Turning points. Compute $\dfrac{dy}{dx} = \dfrac{a}{d} - \dfrac{d^2 c + (ae - bd)e}{d^3}\,\dfrac{1}{(x + e/d)^2}$. Setting this equal to zero, we have $(dx + e)^2 = \dfrac{d^2 c + (ae - bd)e}{a}$. Compute also $\dfrac{d^2 y}{dx^2} = 2\,\dfrac{d^2 c + (ae - bd)e}{(dx + e)^3}$. And thus,

(a) If $\dfrac{d^2 c + (ae - bd)e}{a} < 0$, then there are no stationary points.

(b) If $\dfrac{d^2 c + (ae - bd)e}{a} > 0$, then there are two turning points, at

$$x = \frac{-e - \sqrt{\dfrac{d^2 c + (ae - bd)e}{a}}}{d} \quad \text{and} \quad x = \frac{-e + \sqrt{\dfrac{d^2 c + (ae - bd)e}{a}}}{d}.$$

By the second derivative, if $d^2 c + (ae - bd)e > 0$, then the first is the maximum and the second is the minimum turning point; if $d^2 c + (ae - bd)e < 0$, then the roles are reversed.

3. Asymptotes. As x → −e/d, y → ±∞. Hence, there is one vertical asymptote x = −e/d. As x → ±∞, $y \to \dfrac{a}{d}x + \dfrac{bd - ae}{d^2}$. Hence, there is an oblique asymptote: $y = \dfrac{a}{d}x + \dfrac{bd - ae}{d^2}$. These two asymptotes are not perpendicular and so this is not a rectangular hyperbola.

4. The centre (the point at which the two asymptotes intersect) is $\left(-\dfrac{e}{d}, \dfrac{bd - 2ae}{d^2}\right)$.

5. Symmetry. There are two lines of symmetry:

$$y = \left(\frac{a}{d} + \sqrt{\frac{a^2}{d^2} + 1}\right)x + \frac{bd - ae}{d^2} + \frac{e}{d}\sqrt{\frac{a^2}{d^2} + 1},$$

$$y = \left(\frac{a}{d} - \sqrt{\frac{a^2}{d^2} + 1}\right)x + \frac{bd - ae}{d^2} - \frac{e}{d}\sqrt{\frac{a^2}{d^2} + 1}.$$

The proof that the above are indeed the lines of symmetry for $y = \dfrac{ax^2 + bx + c}{dx + e}$ simply involves a load of messy and boring algebra, which we’ll omit.
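As a rough numerical companion to the five items above, here is a minimal sketch that computes the asymptotes, centre, and turning points of y = (ax² + bx + c)/(dx + e) from the coefficients. The function name `hyperbola_features` and the dictionary keys are assumptions made for illustration; the code assumes a ≠ 0 and d ≠ 0, and reports turning points only when (d²c + (ae − bd)e)/a is positive.

```python
import math

def hyperbola_features(a, b, c, d, e):
    """Key features of y = (a*x**2 + b*x + c) / (d*x + e), assuming a != 0 and d != 0."""
    k = d * d * c + (a * e - b * d) * e          # numerator of the 1/(x + e/d) coefficient
    features = {
        "vertical_asymptote_x": -e / d,
        "oblique_slope": a / d,
        "oblique_intercept": (b * d - a * e) / d**2,
        "centre": (-e / d, (b * d - 2 * a * e) / d**2),
        "turning_points_x": [],
    }
    if k / a > 0:                                 # (dx + e)^2 = k / a has real solutions
        root = math.sqrt(k / a)
        features["turning_points_x"] = sorted([(-e - root) / d, (-e + root) / d])
    return features

# y = (x^2 + 1)/x = x + 1/x: asymptotes x = 0 and y = x, turning points at x = -1 and x = 1.
print(hyperbola_features(1, 0, 1, 1, 0))
# y = (x^2 - 1)/x = x - 1/x: strictly increasing on each branch, so no turning points.
print(hyperbola_features(1, 0, -1, 1, 0))
```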

81 Appendices for Part II: Sequences and Series

Here are the formal definitions of convergent and divergent sequences and series.

Definition 138. Let $(a_n) = (a_1, a_2, a_3, \dots)$ be an infinite sequence. If for every ε > 0 there exists N such that for any n ≥ N, $a_n \in (L - \varepsilon, L + \varepsilon)$, then the sequence $(a_n)$ is convergent; and moreover, it converges to L (L is called the limit of the sequence). We can also write $(a_n) \to L$. A sequence that is not convergent is divergent and its limit does not exist.

Definition 139. (Partial sums.) Let $(a_1, a_2, a_3, \dots)$ be an infinite sequence. We call $S_k = \sum_{i=1}^{k} a_i$ this sequence’s kth partial sum.

Definition 140. (Convergence of series.) Let $(a_n) = (a_1, a_2, a_3, \dots)$ be an infinite sequence and for k = 1, 2, …, let $S_k$ be this sequence’s kth partial sum. Form the infinite sequence $(S_n) = (S_1, S_2, S_3, \dots)$. If $(S_n) \to L \in \mathbb{R}$, then we say that $(a_n)$ has a convergent series and its sum of series exists and is equal to L.

Fact 18 (reproduced from p. 239). If ∣r∣ ≥ 1, then $a_1 + a_1 r + a_1 r^2 + a_1 r^3 + \dots$ diverges.

Proof. We assume $a_1 \neq 0$. Let $S_n = a_1 + a_1 r + a_1 r^2 + \dots + a_1 r^{n-1}$. Suppose for contradiction that as n → ∞, $S_n \to L$.

Then successive partial sums must eventually be arbitrarily close to each other, so there exists K such that $|S_{K+1} - S_K| < |a_1|$.

But $|S_{K+1} - S_K| = |a_1 r^K| \geq |a_1|$, a contradiction.
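To make Definitions 139 and 140 and Fact 18 concrete, the following sketch lists the first few partial sums of a geometric sequence. The helper name `partial_sums` is an assumption made for illustration, and finitely many terms can only suggest, never prove, convergence or divergence.

```python
def partial_sums(a1, r, n):
    """Return [S_1, ..., S_n] for the geometric sequence a1, a1*r, a1*r**2, ..."""
    sums, total, term = [], 0, a1
    for _ in range(n):
        total += term          # S_k = S_{k-1} + a1 * r**(k-1)
        sums.append(total)
        term *= r
    return sums

print(partial_sums(1, 0.5, 6))   # |r| < 1: the partial sums settle towards 2
print(partial_sums(1, 2, 6))     # |r| >= 1: successive sums differ by at least |a1|
print(partial_sums(1, -1, 6))    # |r| = 1: the partial sums oscillate between 1 and 0
```

In the second and third calls, each pair of successive partial sums differs by ∣a₁r^K∣ ≥ ∣a₁∣, which is exactly the quantity used in the proof of Fact 18.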

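82 Appendices for Part III: Vectors

82.1 Vectors in 2D

Fact 22 (reproduced from p. 278). Let a and b be any two non-zero vectors. Then $\hat{\mathbf{a}} = \hat{\mathbf{b}}$ ⇐⇒ a can be written as a scalar multiple of b.

Proof. (⇐) Suppose a = cb. Then $\hat{\mathbf{a}} = \widehat{c\mathbf{b}} = \dfrac{c\mathbf{b}}{c\,|\mathbf{b}|} = \dfrac{\mathbf{b}}{|\mathbf{b}|} = \hat{\mathbf{b}}$.

(⇒) Suppose $\hat{\mathbf{a}} = \hat{\mathbf{b}}$. Then $\mathbf{a} = |\mathbf{a}|\,\hat{\mathbf{a}}$ and $\mathbf{b} = |\mathbf{b}|\,\hat{\mathbf{b}}$, so that indeed $\mathbf{a} = |\mathbf{a}|\,\hat{\mathbf{a}} = |\mathbf{a}|\,\hat{\mathbf{b}} = \dfrac{|\mathbf{a}|}{|\mathbf{b}|}\,\mathbf{b}$ can be written as a scalar multiple of b.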

Fact 23 (reproduced from p. 278 above). Let a and b be any two vectors in the same plane with distinct direction vectors. Then every vector in the same plane can be written as αa + βb for some α, β ∈ R. Proof. I prove only the 2D case. (For higher dimensions, it is much easier to use linear algebra, but this is not covered in H2 maths.) Let a = (a1 , a2 ) and b = (b1 , b2 ). Let c = (c1 , c2 ) be any vector.

Observe that $a_1 b_2 \neq a_2 b_1$, because if $a_1 b_2 = a_2 b_1$, then $a_1, a_2, b_1, b_2 \neq 0$ (otherwise both a and b are zero vectors) and so $\dfrac{a_1}{a_2} = \dfrac{b_1}{b_2}$, in which case a and b have the same direction vector, contradicting our assumption.

Then we do indeed have $\mathbf{c} = (\alpha a_1 + \beta b_1, \alpha a_2 + \beta b_2)$ if we pick α and β such that

$$\alpha a_1 + \beta b_1 = c_1, \tag{1}$$

$$\alpha a_2 + \beta b_2 = c_2. \tag{2}$$

Taking $b_2$ × (1) minus $b_1$ × (2) yields

$$\alpha a_1 b_2 - \alpha a_2 b_1 = b_2 c_1 - b_1 c_2. \tag{3}$$

Taking $a_2$ × (1) minus $a_1$ × (2) yields

$$\beta a_2 b_1 - \beta a_1 b_2 = a_2 c_1 - a_1 c_2. \tag{4}$$

Since $a_1 b_2 \neq a_2 b_1$, we can pick

$$\alpha = \frac{b_2 c_1 - b_1 c_2}{a_1 b_2 - a_2 b_1} \quad \text{and} \quad \beta = \frac{a_1 c_2 - a_2 c_1}{a_1 b_2 - a_2 b_1}.$$
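The closed forms for α and β above translate directly into a few lines of code. This is a minimal sketch with an assumed function name `coefficients`; it takes integer (or rational) 2D vectors as tuples and uses exact fractions.

```python
from fractions import Fraction

def coefficients(a, b, c):
    """Return (alpha, beta) with alpha*a + beta*b == c for 2D vectors a, b, c."""
    det = a[0] * b[1] - a[1] * b[0]              # a1*b2 - a2*b1
    if det == 0:
        raise ValueError("a and b are parallel, so alpha and beta are not determined")
    alpha = Fraction(b[1] * c[0] - b[0] * c[1], det)
    beta = Fraction(a[0] * c[1] - a[1] * c[0], det)
    return alpha, beta

a, b, c = (1, 2), (3, 1), (5, 5)
alpha, beta = coefficients(a, b, c)
print(alpha, beta)
# Reassemble alpha*a + beta*b to confirm that it equals c.
print(tuple(alpha * ai + beta * bi for ai, bi in zip(a, b)))
```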

82.2 Scalar Product

Fact 24 (reproduced from p. 281). Let a, b, and c be vectors. Then a⋅(b+c) = a⋅b+a⋅c. Moreover, (a + b) ⋅ c = a ⋅ c + b ⋅ c. Proof. I prove only the 2D case. Let a = (a1 , a2 ), b = (b1 , b2 ), and c = (c1 , c2 ). Then a ⋅ (b + c) = (a1 , a2 ) ⋅ [(b1 , b2 ) + (c1 , c2 )] = (a1 , a2 ) ⋅ (b1 + c1 , b2 + c2 ) = a1 (b1 + c1 ) + a2 (b2 + c2 ) = a1 b1 + a1 c1 + a2 b2 + a2 c2 = a1 b1 + a2 b2 + a1 c1 + a2 c2 = (a1 , a2 ) ⋅ (b1 , b2 ) + (a1 , a2 ) ⋅ (c1 , c2 ) = a ⋅ b + a ⋅ c.

The proof that (a + b) ⋅ c = a ⋅ c + b ⋅ c is very similar and is thus omitted. Fact 26 (reproduced from p. 282). Let u and v be two vectors (of any dimension) and θ ∈ [0, π] be the angle between them. Then u ⋅ v = ∣u∣ ∣v∣ cos θ.

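Proof. Let u and v correspond to two sides of a triangle. Then u − v corresponds to the third side and θ is the angle opposite this third side. Then by Proposition 6 (Law of Cosines),

$$\begin{aligned}
& |\mathbf{u} - \mathbf{v}|^2 = |\mathbf{u}|^2 + |\mathbf{v}|^2 - 2\,|\mathbf{u}|\,|\mathbf{v}| \cos\theta \\
\iff{} & (\mathbf{u} - \mathbf{v}) \cdot (\mathbf{u} - \mathbf{v}) = \mathbf{u} \cdot \mathbf{u} + \mathbf{v} \cdot \mathbf{v} - 2\,|\mathbf{u}|\,|\mathbf{v}| \cos\theta \\
\iff{} & \mathbf{u} \cdot \mathbf{u} + \mathbf{v} \cdot \mathbf{v} - 2\,\mathbf{u} \cdot \mathbf{v} = \mathbf{u} \cdot \mathbf{u} + \mathbf{v} \cdot \mathbf{v} - 2\,|\mathbf{u}|\,|\mathbf{v}| \cos\theta \\
\iff{} & -2\,\mathbf{u} \cdot \mathbf{v} = -2\,|\mathbf{u}|\,|\mathbf{v}| \cos\theta \\
\iff{} & \mathbf{u} \cdot \mathbf{v} = |\mathbf{u}|\,|\mathbf{v}| \cos\theta \\
\iff{} & \cos\theta = \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}|\,|\mathbf{v}|}.
\end{aligned}$$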

82.3 The Ratio Theorem

Theorem 3 (reproduced from p. 279.) Ratio Theorem. Let a and b be points. Let p be a point on the line segment ab. Then

$$\mathbf{p} = \frac{|\vec{bp}|}{|\vec{ap}| + |\vec{bp}|}\,\mathbf{a} + \frac{|\vec{ap}|}{|\vec{ap}| + |\vec{bp}|}\,\mathbf{b}.$$

Proof. The vector $\vec{ap}$ is in the same direction as $\vec{ab} = \mathbf{b} - \mathbf{a}$, but the length of $\vec{ap}$ is $\dfrac{|\vec{ap}|}{|\vec{ap}| + |\vec{bp}|}$ times that of $\vec{ab}$. That is:

$$\vec{ap} = \frac{|\vec{ap}|}{|\vec{ap}| + |\vec{bp}|}\,(\mathbf{b} - \mathbf{a}).$$

Hence,

$$\begin{aligned}
\mathbf{p} &= \mathbf{a} + \vec{ap} \\
&= \mathbf{a} + \frac{|\vec{ap}|}{|\vec{ap}| + |\vec{bp}|}\,(\mathbf{b} - \mathbf{a}) \\
&= \left(1 - \frac{|\vec{ap}|}{|\vec{ap}| + |\vec{bp}|}\right)\mathbf{a} + \frac{|\vec{ap}|}{|\vec{ap}| + |\vec{bp}|}\,\mathbf{b} \\
&= \frac{|\vec{bp}|}{|\vec{ap}| + |\vec{bp}|}\,\mathbf{a} + \frac{|\vec{ap}|}{|\vec{ap}| + |\vec{bp}|}\,\mathbf{b}.
\end{aligned}$$
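The Ratio Theorem can be tried out on a concrete segment. In this sketch the function name `ratio_point` and its arguments `ap`, `bp` (the two lengths ∣ap∣ and ∣bp∣, or any two positive numbers in the same ratio) are assumptions made for illustration.

```python
import math

def ratio_point(a, b, ap, bp):
    """Point p on segment ab with |ap| : |bp| = ap : bp, via the Ratio Theorem."""
    total = ap + bp
    return tuple((bp * ai + ap * bi) / total for ai, bi in zip(a, b))

a, b = (0, 0), (4, 8)
p = ratio_point(a, b, 1, 3)          # p divides ab in the ratio 1 : 3
print(p)
# The actual distances from p to a and to b should be in the ratio 1 : 3.
print(math.dist(a, p), math.dist(p, b))
```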

82.4 Vector Product

Fact 29 (reproduced from p. 295). Let u and v be two non-zero 2D vectors and θ ∈ [0, π] be the angle between them. Then the scalar u × v is equal to either ∣u∣ ∣v∣ sin θ or − ∣u∣ ∣v∣ sin θ.

Proof. Let α and β be the angles that u and v make with the positive x-axis (the angles are measured counter-clockwise). Then u × v = ux vy − uy vx

= ∣u∣ cos α ∣v∣ sin β − ∣u∣ sin α ∣v∣ cos β = ∣u∣ ∣v∣ (cos α sin β − sin α cos β) = ∣u∣ ∣v∣ sin (β − α) .

Case #1. If β ≥ α, then θ = β − α and so sin (β − α) = sin θ. Thus, u × v = ∣u∣ ∣v∣ sin θ, as desired.

Case #2. If α > β, then θ = α − β and so sin (β − α) = sin (−θ) = − sin θ. Thus, u × v = − ∣u∣ ∣v∣ sin θ, as desired.


Lemma 1. The parallelopiped with sides of lengths ∣a∣, ∣b∣, and ∣c∣ has volume a ⋅ (b × c).

Proof. The base of the parallelopiped has area ∣b × c∣.

Its height is the projection of a onto the vector that is perpendicular to both b and c. That is, its height is $\mathbf{a} \cdot \widehat{(\mathbf{b} \times \mathbf{c})}$.

We know that the volume of a parallelopiped is equal to “Height × Base Area”.

So in this case, it is $\mathbf{a} \cdot \widehat{(\mathbf{b} \times \mathbf{c})}\,|\mathbf{b} \times \mathbf{c}|$. We know that $\mathbf{u} \cdot \hat{\mathbf{v}}\,|\mathbf{v}| = \mathbf{u} \cdot \mathbf{v}$. Hence, the volume of the parallelopiped is also a ⋅ (b × c).

The next Lemma is an easy corollary to the above Lemma.

Lemma 2. a ⋅ (b × c) = (a × b) ⋅ c.

Proof. We can similarly prove that the parallelopiped with sides of lengths ∣a∣, ∣b∣, and ∣c∣ also has volume (a × b) ⋅ c. Fact 33 (reproduced from p. 300). Let a, b, and c be vectors. Then a × (b + c) = a × b + a × c. Moreover, (a + b) × c = a × c + b × c. Proof. We prove only that a × (b + c) = a × b + a × c.

(Proving that (a + b) × c = a × c + b × c is very similar.)

Let d = a × (b + c) − (a × b + a × c). Then

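$$\begin{aligned}
\mathbf{d} \cdot \mathbf{d} &= \mathbf{d} \cdot [\mathbf{a} \times (\mathbf{b} + \mathbf{c}) - (\mathbf{a} \times \mathbf{b} + \mathbf{a} \times \mathbf{c})] \\
&= \mathbf{d} \cdot [\mathbf{a} \times (\mathbf{b} + \mathbf{c})] - \mathbf{d} \cdot (\mathbf{a} \times \mathbf{b} + \mathbf{a} \times \mathbf{c}) \\
&= \mathbf{d} \cdot [\mathbf{a} \times (\mathbf{b} + \mathbf{c})] - \mathbf{d} \cdot (\mathbf{a} \times \mathbf{b}) - \mathbf{d} \cdot (\mathbf{a} \times \mathbf{c}) \\
&= (\mathbf{d} \times \mathbf{a}) \cdot (\mathbf{b} + \mathbf{c}) - (\mathbf{d} \times \mathbf{a}) \cdot \mathbf{b} - (\mathbf{d} \times \mathbf{a}) \cdot \mathbf{c} \\
&= (\mathbf{d} \times \mathbf{a}) \cdot \mathbf{b} + (\mathbf{d} \times \mathbf{a}) \cdot \mathbf{c} - (\mathbf{d} \times \mathbf{a}) \cdot \mathbf{b} - (\mathbf{d} \times \mathbf{a}) \cdot \mathbf{c} = 0,
\end{aligned}$$

where the second and third lines follow from the distributivity of the scalar product, and the fourth line uses Lemma 2. Since d ⋅ d = 0 ⇐⇒ d = 0, we have a × (b + c) = a × b + a × c, as desired.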

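Proposition 7 (reproduced from p. 300). Given two 3D vectors $\mathbf{u} = (u_x, u_y, u_z)$ and $\mathbf{v} = (v_x, v_y, v_z)$, their vector product is given by:

$$\mathbf{u} \times \mathbf{v} = \begin{pmatrix} u_y v_z - u_z v_y \\ u_z v_x - u_x v_z \\ u_x v_y - u_y v_x \end{pmatrix}.$$

Proof.

$$\begin{aligned}
\mathbf{u} \times \mathbf{v} &= (u_x \mathbf{i} + u_y \mathbf{j} + u_z \mathbf{k}) \times (v_x \mathbf{i} + v_y \mathbf{j} + v_z \mathbf{k}) \\
&= u_x \mathbf{i} \times (v_x \mathbf{i} + v_y \mathbf{j} + v_z \mathbf{k}) + u_y \mathbf{j} \times (v_x \mathbf{i} + v_y \mathbf{j} + v_z \mathbf{k}) + u_z \mathbf{k} \times (v_x \mathbf{i} + v_y \mathbf{j} + v_z \mathbf{k}) && \text{(distributivity)} \\
&= u_x v_x (\mathbf{i} \times \mathbf{i}) + u_x v_y (\mathbf{i} \times \mathbf{j}) + u_x v_z (\mathbf{i} \times \mathbf{k}) + u_y v_x (\mathbf{j} \times \mathbf{i}) + u_y v_y (\mathbf{j} \times \mathbf{j}) \\
&\quad + u_y v_z (\mathbf{j} \times \mathbf{k}) + u_z v_x (\mathbf{k} \times \mathbf{i}) + u_z v_y (\mathbf{k} \times \mathbf{j}) + u_z v_z (\mathbf{k} \times \mathbf{k}) && \text{(distributivity)} \\
&= \mathbf{0} + u_x v_y \mathbf{k} + u_x v_z (-\mathbf{j}) + u_y v_x (-\mathbf{k}) + \mathbf{0} + u_y v_z \mathbf{i} + u_z v_x \mathbf{j} + u_z v_y (-\mathbf{i}) + \mathbf{0} && \text{(Fact 32)} \\
&= (u_y v_z - u_z v_y)\,\mathbf{i} + (u_z v_x - u_x v_z)\,\mathbf{j} + (u_x v_y - u_y v_x)\,\mathbf{k} \\
&= \begin{pmatrix} u_y v_z - u_z v_y \\ u_z v_x - u_x v_z \\ u_x v_y - u_y v_x \end{pmatrix}.
\end{aligned}$$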
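The component formula of Proposition 7 is short enough to code directly, which also gives a quick numerical check of Lemma 2 and of the distributivity in Fact 33. The helper names `cross`, `dot`, and `add` below are assumptions made for illustration.

```python
def cross(u, v):
    """Vector product of two 3D vectors, using the component formula of Proposition 7."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product of two vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))

def add(u, v):
    """Componentwise sum of two vectors."""
    return tuple(x + y for x, y in zip(u, v))

a, b, c = (1, 2, 3), (4, 5, 6), (7, 8, 10)
print(cross((1, 0, 0), (0, 1, 0)))                              # i x j
print(dot(a, cross(b, c)), dot(cross(a, b), c))                 # Lemma 2: both sides
print(cross(a, add(b, c)), add(cross(a, b), cross(a, c)))       # Fact 33: both sides
```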

82.5 2D Geometry

Fact 34 (reproduced from p. 307). The line with vector equation $\mathbf{r} = (p_1, p_2) + \lambda (v_1, v_2)$ (for λ ∈ R) is the line with cartesian equations as given by the 3 cases below.

(1) $\dfrac{x - p_1}{v_1} = \dfrac{y - p_2}{v_2}$, if $v_1, v_2 \neq 0$;

(2) $x = p_1$, y is free, if $v_1 = 0$, $v_2 \neq 0$;

(3) x is free, $y = p_2$, if $v_2 = 0$, $v_1 \neq 0$.

Proof. Write out:

$$x = p_1 + \lambda v_1, \tag{1}$$

$$y = p_2 + \lambda v_2. \tag{2}$$

Eliminate λ by taking $v_2$ × (1) minus $v_1$ × (2):

$$v_2 x - v_1 y = v_2 p_1 + \lambda v_1 v_2 - v_1 p_2 - \lambda v_1 v_2 = v_2 p_1 - v_1 p_2.$$

Rearranging, we have $v_2 (x - p_1) = v_1 (y - p_2)$.

(1) If $v_1, v_2 \neq 0$, then we can divide both sides by $v_1 v_2$ to find that indeed $\dfrac{x - p_1}{v_1} = \dfrac{y - p_2}{v_2}$.

(2) If $v_1 = 0$, $v_2 \neq 0$, then indeed $x = p_1$ and y is free to vary.

(3) If $v_1 \neq 0$, $v_2 = 0$, then indeed x is free to vary and $y = p_2$.

82.6 3D Geometry

Fact 35 (reproduced from p. 311). The line with vector equation $\mathbf{r} = (p_1, p_2, p_3) + \lambda (v_1, v_2, v_3)$ (for λ ∈ R) is the line with cartesian equations as given by the 7 cases below.

(1) $\dfrac{x - p_1}{v_1} = \dfrac{y - p_2}{v_2} = \dfrac{z - p_3}{v_3}$, if $v_1, v_2, v_3 \neq 0$;

(2) $x = p_1$, $\dfrac{y - p_2}{v_2} = \dfrac{z - p_3}{v_3}$, if $v_1 = 0$, $v_2, v_3 \neq 0$;

(3) $y = p_2$, $\dfrac{x - p_1}{v_1} = \dfrac{z - p_3}{v_3}$, if $v_2 = 0$, $v_1, v_3 \neq 0$;

(4) $z = p_3$, $\dfrac{x - p_1}{v_1} = \dfrac{y - p_2}{v_2}$, if $v_3 = 0$, $v_1, v_2 \neq 0$;

(5) $x = p_1$, $y = p_2$, z is free, if $v_1, v_2 = 0$, $v_3 \neq 0$;

(6) $x = p_1$, $z = p_3$, y is free, if $v_1, v_3 = 0$, $v_2 \neq 0$;

(7) $y = p_2$, $z = p_3$, x is free, if $v_2, v_3 = 0$, $v_1 \neq 0$.

Proof. Write

$$x = p_1 + \lambda v_1, \tag{1}$$

$$y = p_2 + \lambda v_2, \tag{2}$$

$$z = p_3 + \lambda v_3. \tag{3}$$

Taking $v_2$ × (1) minus $v_1$ × (2) yields:

$$v_2 x - v_1 y = v_2 p_1 + \lambda v_1 v_2 - v_1 p_2 - \lambda v_1 v_2 = v_2 p_1 - v_1 p_2.$$

Rearrange the above into

$$v_2 x - v_1 y + v_1 p_2 - v_2 p_1 = 0. \tag{4}$$

Similarly, taking $v_3$ × (2) minus $v_2$ × (3) yields

$$v_3 y - v_2 z + v_2 p_3 - v_3 p_2 = 0. \tag{5}$$

Finally, taking $v_1$ × (3) minus $v_3$ × (1) yields

$$v_1 z - v_3 x + v_3 p_1 - v_1 p_3 = 0. \tag{6}$$

(1) If $v_1, v_2, v_3 \neq 0$, then (4) and (5) become $\dfrac{x - p_1}{v_1} = \dfrac{y - p_2}{v_2}$ and $\dfrac{y - p_2}{v_2} = \dfrac{z - p_3}{v_3}$.

(2) If instead $v_1 = 0$ but $v_2, v_3 \neq 0$, then (4) and (5) become $x = p_1$ and $\dfrac{y - p_2}{v_2} = \dfrac{z - p_3}{v_3}$.

I omit the proofs of cases (3) and (4), which are very similar.

(7) If $v_1 \neq 0$ but $v_2, v_3 = 0$, then (4) and (6) become $y = p_2$ and $z = p_3$.

I omit the proofs of cases (5) and (6), which are very similar.

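Proposition 8 (reproduced from p. 332). Given a point a and a line r = p + λv (for λ ∈ R),

(a) The distance between the point and the line is $\sqrt{|\vec{pa}|^2 - (\vec{pa} \cdot \hat{\mathbf{v}})^2}$; and

(b) The foot of the perpendicular from the point to the line is the point $\mathbf{p} + (\vec{pa} \cdot \hat{\mathbf{v}})\,\hat{\mathbf{v}}$.

Proof. Consider the point $\mathbf{a} = (a_1, a_2, a_3)$ and the line

$$\mathbf{r} = \mathbf{p} + \lambda \mathbf{v} = (p_1 + \lambda v_1, p_2 + \lambda v_2, p_3 + \lambda v_3).$$

The distance between the point a and a generic point on the line is

$$\sqrt{(a_1 - p_1 - \lambda v_1)^2 + (a_2 - p_2 - \lambda v_2)^2 + (a_3 - p_3 - \lambda v_3)^2} = \sqrt{\lambda^2 |\mathbf{v}|^2 + |\vec{pa}|^2 - 2\lambda\,\vec{pa} \cdot \mathbf{v}}.$$

So the minimum point is given by

$$\frac{d}{d\lambda}\left[\lambda^2 |\mathbf{v}|^2 + |\vec{pa}|^2 - 2\lambda\,\vec{pa} \cdot \mathbf{v}\right] = 2\lambda |\mathbf{v}|^2 - 2\,\vec{pa} \cdot \mathbf{v} \overset{\text{set}}{=} 0 \iff \lambda = \frac{\mathbf{v} \cdot (\mathbf{a} - \mathbf{p})}{|\mathbf{v}|^2} = \frac{\hat{\mathbf{v}} \cdot \vec{pa}}{|\mathbf{v}|}.$$

(a) Hence, the distance between the point and the line is

$$\begin{aligned}
\sqrt{\lambda^2 |\mathbf{v}|^2 + |\vec{pa}|^2 - 2\lambda\,\vec{pa} \cdot \mathbf{v}} &= \sqrt{(\hat{\mathbf{v}} \cdot \vec{pa})^2 + |\vec{pa}|^2 - 2\,\frac{\hat{\mathbf{v}} \cdot \vec{pa}}{|\mathbf{v}|}\,(\vec{pa} \cdot \mathbf{v})} \\
&= \sqrt{(\hat{\mathbf{v}} \cdot \vec{pa})^2 + |\vec{pa}|^2 - 2\,(\hat{\mathbf{v}} \cdot \vec{pa})(\hat{\mathbf{v}} \cdot \vec{pa})} \\
&= \sqrt{|\vec{pa}|^2 - (\hat{\mathbf{v}} \cdot \vec{pa})^2}, \text{ as desired.}
\end{aligned}$$

(b) And the foot of the perpendicular is $\mathbf{p} + \lambda \mathbf{v} = \mathbf{p} + (\hat{\mathbf{v}} \cdot \vec{pa})\,\dfrac{\mathbf{v}}{|\mathbf{v}|} = \mathbf{p} + (\hat{\mathbf{v}} \cdot \vec{pa})\,\hat{\mathbf{v}}$, also as desired.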
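Both parts of Proposition 8 can be evaluated in a few lines. The sketch below uses an assumed function name `foot_and_distance` and plain tuples for points and vectors; the `max(0.0, ...)` guard is an addition made here to avoid taking the square root of a tiny negative number caused by floating-point rounding.

```python
import math

def foot_and_distance(a, p, v):
    """Foot of the perpendicular from point a to the line r = p + lambda*v, and the distance."""
    pa = [ai - pi for ai, pi in zip(a, p)]                 # vector from p to a
    norm_v = math.sqrt(sum(x * x for x in v))
    v_hat = [x / norm_v for x in v]                        # unit vector along the line
    proj = sum(x * y for x, y in zip(pa, v_hat))           # pa . v_hat
    foot = tuple(pi + proj * vh for pi, vh in zip(p, v_hat))
    dist = math.sqrt(max(0.0, sum(x * x for x in pa) - proj**2))
    return foot, dist

# Line along the x-axis through the origin; the point (3, 4, 0) is 4 units away from it.
print(foot_and_distance((3, 4, 0), (0, 0, 0), (2, 0, 0)))
```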

Fact 40 (reproduced from p. 358). Given a plane and a line, there are three possible cases:

1. The line and plane are parallel and do not intersect at all.
2. The line and plane are parallel and the line lies completely on the plane.
3. The line and plane are not parallel and intersect at exactly one point.

Proof. Let the line be r = p + λv (λ ∈ R) and the plane be r ⋅ n = d.

We look for the points on the line that are also on the plane. To do so, we look for the values of λ for which

$$(\mathbf{p} + \lambda \mathbf{v}) \cdot \mathbf{n} = d$$

$$\mathbf{p} \cdot \mathbf{n} + \lambda\,\mathbf{v} \cdot \mathbf{n} = d$$

$$\lambda\,\mathbf{v} \cdot \mathbf{n} = d - \mathbf{p} \cdot \mathbf{n}, \tag{1}$$

where the second line uses the distributivity of the scalar product.

Case #1. The plane and line are not parallel. Then v ⋅ n ≠ 0 and we can divide both sides by v ⋅ n to get

$$\lambda = \frac{d - \mathbf{p} \cdot \mathbf{n}}{\mathbf{v} \cdot \mathbf{n}}.$$

Thus, the line and plane intersect at a single point, namely $\mathbf{p} + \dfrac{d - \mathbf{p} \cdot \mathbf{n}}{\mathbf{v} \cdot \mathbf{n}}\,\mathbf{v}$.

Case #2. The plane and line are parallel. Then v ⋅ n = 0. And so either (1) is true for all λ (when p ⋅ n = d) or (1) is true for no λ (when p ⋅ n ≠ d). That is, either the line is completely on the plane or it does not intersect the plane at all.
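The case analysis in the proof of Fact 40 maps onto a small function. This is a sketch under stated assumptions: the function name `line_plane_intersection` and the returned labels are made up for illustration, and the inputs are assumed to be integers (or fractions) so that the test v ⋅ n = 0 is exact.

```python
from fractions import Fraction

def dot(u, v):
    """Scalar product of two vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))

def line_plane_intersection(p, v, n, d):
    """Classify how the line r = p + lambda*v meets the plane r . n = d."""
    v_dot_n = dot(v, n)
    p_dot_n = dot(p, n)
    if v_dot_n != 0:                                   # not parallel: exactly one point
        lam = Fraction(d - p_dot_n, v_dot_n)
        return "single point", tuple(pi + lam * vi for pi, vi in zip(p, v))
    if p_dot_n == d:                                   # parallel, and p is on the plane
        return "line lies on plane", None
    return "no intersection", None                     # parallel, and p is off the plane

print(line_plane_intersection((1, 0, 0), (1, 1, 1), (0, 0, 1), 2))
print(line_plane_intersection((0, 0, 2), (1, 0, 0), (0, 0, 1), 2))
print(line_plane_intersection((0, 0, 2), (1, 0, 0), (0, 0, 1), 3))
```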

Fact 41 (reproduced from p. 360). Two non-parallel planes with normal vectors n1 and n2 intersect at all if and only if they intersect along a line with direction vector n1 × n2 (i.e. the line is perpendicular to both n1 and n2 ). Proof. Let the two planes be r ⋅ n1 = d1 and r ⋅ n2 = d2 .

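(⇐) Trivial: if they intersect along such a line, then of course they intersect.

(⇒) Suppose the two planes intersect at some point p. So p is on both planes and we have

$$\mathbf{p} \cdot \mathbf{n}_1 = d_1 \tag{1}$$

and $\mathbf{p} \cdot \mathbf{n}_2 = d_2$. Our goal is to show that a point q is on both planes (i.e. $\mathbf{q} \cdot \mathbf{n}_1 = d_1$ and $\mathbf{q} \cdot \mathbf{n}_2 = d_2$) if and only if $\mathbf{q} = \mathbf{p} + \lambda (\mathbf{n}_1 \times \mathbf{n}_2)$ (for λ ∈ R). That is, the points of intersection are exactly those points along the line $\mathbf{r} = \mathbf{p} + \lambda (\mathbf{n}_1 \times \mathbf{n}_2)$ (for λ ∈ R).

Any point can be written as $\mathbf{q} = \mathbf{p} + \lambda (\mathbf{n}_1 \times \mathbf{n}_2) + \mu \mathbf{v}$, where λ, µ ∈ R and v is some vector that is not perpendicular to $\mathbf{n}_1$. Then

$$\begin{aligned}
\mathbf{q} \cdot \mathbf{n}_1 &= (\mathbf{p} + \lambda (\mathbf{n}_1 \times \mathbf{n}_2) + \mu \mathbf{v}) \cdot \mathbf{n}_1 \\
&= \mathbf{p} \cdot \mathbf{n}_1 + \lambda (\mathbf{n}_1 \times \mathbf{n}_2) \cdot \mathbf{n}_1 + \mu\,\mathbf{v} \cdot \mathbf{n}_1 && \text{(distributivity of scalar product)} \\
&= d_1 + \lambda (\mathbf{n}_1 \times \mathbf{n}_2) \cdot \mathbf{n}_1 + \mu\,\mathbf{v} \cdot \mathbf{n}_1 && \text{(using (1))} \\
&= d_1 + \mu\,\mathbf{v} \cdot \mathbf{n}_1 && (\because (\mathbf{n}_1 \times \mathbf{n}_2) \perp \mathbf{n}_1).
\end{aligned}$$

Since v is not perpendicular to $\mathbf{n}_1$, $\mathbf{q} \cdot \mathbf{n}_1 = d_1$ if and only if µ = 0. In other words, q is on the first plane if and only if $\mathbf{q} = \mathbf{p} + \lambda (\mathbf{n}_1 \times \mathbf{n}_2)$.

Similarly, any point can be written as $\mathbf{q} = \mathbf{p} + \lambda (\mathbf{n}_1 \times \mathbf{n}_2) + \gamma \mathbf{u}$, where λ, γ ∈ R and u is some vector that is not perpendicular to $\mathbf{n}_2$. Then ... (similar to above) ... Since u is not perpendicular to $\mathbf{n}_2$, $\mathbf{q} \cdot \mathbf{n}_2 = d_2$ if and only if γ = 0. In other words, q is on the second plane if and only if $\mathbf{q} = \mathbf{p} + \lambda (\mathbf{n}_1 \times \mathbf{n}_2)$.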

Altogether, q is on both planes if and only if q = p + λ (n1 × n2 ).


Fact 88. Let two planes be described by the following cartesian equations: ax + by + cz + d = 0, ex + f y + gz + h = 0. If the two planes are not parallel (i.e. (a, b, c) cannot be written as a scalar multiple of (e, f, g)), then they share at least one point of intersection.

Proof. Pick any $(i_1, i_2, i_3)$ such that $a i_1 + b i_2 + c i_3 = 0$ but $e i_1 + f i_2 + g i_3 \neq 0$. (This vector exists because of the assumption that (a, b, c) cannot be written as a scalar multiple of (e, f, g).) Pick any $(j_1, j_2, j_3)$ such that $a j_1 + b j_2 + c j_3 + d = 0$. Then the following point lies on both planes:

$$(j_1, j_2, j_3) - \frac{e j_1 + f j_2 + g j_3 + h}{e i_1 + f i_2 + g i_3}\,(i_1, i_2, i_3),$$

as can be easily verified:

$$\underbrace{a j_1 + b j_2 + c j_3 + d}_{=0} - \frac{e j_1 + f j_2 + g j_3 + h}{e i_1 + f i_2 + g i_3}\,\underbrace{(a i_1 + b i_2 + c i_3)}_{=0} = 0, \quad ✓$$

$$e j_1 + f j_2 + g j_3 + h - \frac{e j_1 + f j_2 + g j_3 + h}{e i_1 + f i_2 + g i_3}\,(e i_1 + f i_2 + g i_3) = 0. \quad ✓$$

Fact 42 (reproduced from p. 360). Given two planes, there are three possible cases: 1. The two planes are parallel and exactly identical. 2. The two planes are parallel and do not intersect at all. 3. The two planes are not parallel and share an intersection line with direction vector n1 × n2 (where n1 , n2 are the normal vectors of the plane).

Proof. Let the two planes be r ⋅ n1 = d1 and r ⋅ n2 = d2 .

Suppose they are parallel (i.e. n1 = cn2 for some c ∈ R). If they intersect at one point p, then both planes can be written as r ⋅ n2 = d2 and are thus exactly identical. So either they do not intersect at all or they are exactly identical. If they are not parallel, then Fact 88 shows that they intersect. Fact 41 then says that they intersect along a line with direction vector n1 × n2 .

83 Appendices for Part IV: Complex Numbers

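Fact 89. Let b > 0. Then (a) the two square roots of a + bi (i.e. the solutions to the equation x² = a + bi) are

$$\pm \frac{\sqrt{2}}{2}\left(\sqrt{\sqrt{a^2 + b^2} + a} + i\sqrt{\sqrt{a^2 + b^2} - a}\right).$$

(b) And the two square roots of a − bi (i.e. the solutions to the equation x² = a − bi) are

$$\pm \frac{\sqrt{2}}{2}\left(\sqrt{\sqrt{a^2 + b^2} + a} - i\sqrt{\sqrt{a^2 + b^2} - a}\right).$$

Proof.

$$\begin{aligned}
\text{(a)} \quad & \left[\pm \frac{\sqrt{2}}{2}\left(\sqrt{\sqrt{a^2 + b^2} + a} + i\sqrt{\sqrt{a^2 + b^2} - a}\right)\right]^2 \\
&= \frac{1}{2}\left[\sqrt{a^2 + b^2} + a - \left(\sqrt{a^2 + b^2} - a\right) + 2i\sqrt{\left(\sqrt{a^2 + b^2} + a\right)\left(\sqrt{a^2 + b^2} - a\right)}\right] \\
&= \frac{1}{2}\left[2a + 2i\sqrt{a^2 + b^2 - a^2}\right] \\
&= a + i\sqrt{b^2} = a + ib.
\end{aligned}$$

$$\begin{aligned}
\text{(b)} \quad & \left[\pm \frac{\sqrt{2}}{2}\left(\sqrt{\sqrt{a^2 + b^2} + a} - i\sqrt{\sqrt{a^2 + b^2} - a}\right)\right]^2 \\
&= \frac{1}{2}\left[\sqrt{a^2 + b^2} + a - \left(\sqrt{a^2 + b^2} - a\right) - 2i\sqrt{\left(\sqrt{a^2 + b^2} + a\right)\left(\sqrt{a^2 + b^2} - a\right)}\right] \\
&= \frac{1}{2}\left[2a - 2i\sqrt{a^2 + b^2 - a^2}\right] \\
&= a - i\sqrt{b^2} = a - ib.
\end{aligned}$$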
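A quick floating-point check of the square-root formula: the sketch below builds the roots of a + bi for b > 0 and squares them. The function name `square_roots` is an assumption made for illustration. For a − bi, the check uses the standard fact that conjugating a root of a + bi gives a root of a − bi.

```python
import math

def square_roots(a, b):
    """Both square roots of a + bi, assuming b > 0."""
    r = math.hypot(a, b)                                   # sqrt(a^2 + b^2)
    root = (math.sqrt(2) / 2) * complex(math.sqrt(r + a), math.sqrt(r - a))
    return root, -root

r1, r2 = square_roots(3, 4)
print(r1, r2)                      # approximately 2 + i and -2 - i
print(r1 * r1, r2 * r2)            # both approximately 3 + 4i
print(r1.conjugate() ** 2)         # approximately 3 - 4i
```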

Lemma 3. (a) (p + q)∗ = p∗ + q∗. (b) (p × q)∗ = p∗ × q∗.

Proof. Let p = a + bi and q = c + di.

(a) Then (p + q)∗ = a + c − (b + d)i. And p∗ + q∗ = a − bi + c − di = a + c − (b + d)i, so that indeed (p + q)∗ = p∗ + q∗.

(b) Also, (p × q)∗ = [(a + bi)(c + di)]∗ = [ac − bd + (ad + bc)i]∗ = ac − bd − (ad + bc)i. And p∗ × q∗ = (a − bi)(c − di) = ac − bd − (ad + bc)i, so that indeed (p × q)∗ = p∗ × q∗.

Theorem 5 (reproduced from p. 386). If a + bi solves $a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + \dots + a_1 x + a_0 = 0$ (where all $a_k$ are real), then so does a − bi.

Proof. Write $a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + \dots + a_1 x + a_0 = \sum_{k=0}^{n} a_k x^k$. Since a + bi solves $a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + \dots + a_1 x + a_0 = 0$, we have $\sum_{k=0}^{n} a_k (a + bi)^k = 0$. Observe that

$$\left[\sum_{k=0}^{n} a_k (a + bi)^k\right]^* = 0^* = 0.$$

Now repeatedly use Lemma 3 (above) to show that the LHS expression equals $\sum_{k=0}^{n} a_k (a - bi)^k$:

$$\begin{aligned}
\left[\sum_{k=0}^{n} a_k (a + bi)^k\right]^* &= \sum_{k=0}^{n} \left[a_k (a + bi)^k\right]^* = \sum_{k=0}^{n} a_k^* \left[(a + bi)^k\right]^* \\
&= \sum_{k=0}^{n} a_k \left[(a + bi)^k\right]^* = \sum_{k=0}^{n} a_k \left[(a + bi)^*\right]^k = \sum_{k=0}^{n} a_k (a - bi)^k.
\end{aligned}$$

Theorem 6 (Euler Formula, reproduced from p. 394.) For any θ ∈ R, $e^{i\theta} = \cos\theta + i \sin\theta$.

Proof. Let f ∶ R → C be defined by $f(\theta) = e^{-i\theta}(\cos\theta + i \sin\theta)$. Now take the derivative.⁸⁷

$$f'(\theta) = (-i)e^{-i\theta}(\cos\theta + i \sin\theta) + e^{-i\theta}(-\sin\theta + i \cos\theta) = 0.$$

Since the only functions whose derivatives are zero are constant functions, f(θ) = C for some constant C.

Thus, $e^{-i\theta}(\cos\theta + i \sin\theta) = C$ or $\cos\theta + i \sin\theta = Ce^{i\theta}$. Plugging in θ = 0 reveals that C = 1 and yields the desired result.

⁸⁷ We’re actually cheating a little with this proof here, because we haven’t proven how we can take derivatives of complex-valued functions.

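Fact 90. Let θ ∈ (−π, π] and $\varphi_k = \dfrac{\theta + 2k\pi}{n}$. Then $\varphi_k \in (-\pi, \pi]$ for

1. $k = 0, \pm 1, \pm 2, \dots, \pm\dfrac{n-1}{2}$, if n is odd;
2. $k = 0, \pm 1, \pm 2, \dots, \pm\left(\dfrac{n}{2} - 1\right), -\dfrac{n}{2}$, if n is even AND θ > 0;
3. $k = 0, \pm 1, \pm 2, \dots, \pm\left(\dfrac{n}{2} - 1\right), \dfrac{n}{2}$, if n is even AND θ ≤ 0.

Proof. In each case, examine the largest and smallest $\varphi_k$. If they are both in the interval (−π, π], then every $\varphi_k$ is also in the interval (−π, π].

1. n is odd. Since θ ∈ (−π, π], we have $\dfrac{\theta}{n} \in \left(-\dfrac{\pi}{n}, \dfrac{\pi}{n}\right]$.

$$\max \varphi_k = \frac{\theta}{n} + \frac{2(n-1)}{2}\,\frac{\pi}{n} = \frac{\theta}{n} + (n-1)\frac{\pi}{n} \in \left(-\frac{\pi}{n} + (n-1)\frac{\pi}{n},\ \frac{\pi}{n} + (n-1)\frac{\pi}{n}\right] = \left((n-2)\frac{\pi}{n},\ \pi\right].$$

And so indeed $\max \varphi_k \in (-\pi, \pi]$.

$$\min \varphi_k = \frac{\theta}{n} - \frac{2(n-1)}{2}\,\frac{\pi}{n} = \frac{\theta}{n} - (n-1)\frac{\pi}{n} \in \left(-\frac{\pi}{n} - (n-1)\frac{\pi}{n},\ \frac{\pi}{n} - (n-1)\frac{\pi}{n}\right] = \left(-\pi,\ (2-n)\frac{\pi}{n}\right].$$

And so indeed $\min \varphi_k \in (-\pi, \pi]$.

2. n is even AND θ > 0. Since θ ∈ (0, π], we have $\dfrac{\theta}{n} \in \left(0, \dfrac{\pi}{n}\right]$.

$\max \varphi_k = \dfrac{\theta}{n} + 2\left(\dfrac{n}{2} - 1\right)\dfrac{\pi}{n} = \dfrac{\theta - 2\pi}{n} + \pi \in \left(\pi - \dfrac{2\pi}{n},\ \pi - \dfrac{\pi}{n}\right]$. And so indeed $\max \varphi_k \in (-\pi, \pi]$.

$\min \varphi_k = \dfrac{\theta}{n} - 2\,\dfrac{n}{2}\,\dfrac{\pi}{n} = \dfrac{\theta}{n} - \pi \in \left(-\pi,\ \dfrac{\pi}{n} - \pi\right]$. And so indeed $\min \varphi_k \in (-\pi, \pi]$.

3. n is even AND θ ≤ 0. Since θ ∈ (−π, 0], we have $\dfrac{\theta}{n} \in \left(-\dfrac{\pi}{n}, 0\right]$.

$\max \varphi_k = \dfrac{\theta}{n} + 2\,\dfrac{n}{2}\,\dfrac{\pi}{n} = \dfrac{\theta}{n} + \pi \in \left(\pi - \dfrac{\pi}{n},\ \pi\right]$. And so indeed $\max \varphi_k \in (-\pi, \pi]$.

$\min \varphi_k = \dfrac{\theta}{n} - 2\left(\dfrac{n}{2} - 1\right)\dfrac{\pi}{n} = \dfrac{\theta + 2\pi}{n} - \pi \in \left(-\pi + \dfrac{\pi}{n},\ -\pi + \dfrac{2\pi}{n}\right]$. And so indeed $\min \varphi_k \in (-\pi, \pi]$.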
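The three cases of Fact 90 amount to choosing n consecutive values of k. The sketch below, with assumed helper names `k_range` and `arguments`, lists those values and the resulting angles so that the claim can be checked numerically for particular n and θ. Because the angles are computed in floating point, any comparison against π should allow a small tolerance.

```python
import math

def k_range(n, theta):
    """Values of k for which (theta + 2*k*pi)/n lies in (-pi, pi], following Fact 90."""
    if n % 2 == 1:
        half = (n - 1) // 2
        return range(-half, half + 1)          # case 1: -(n-1)/2, ..., (n-1)/2
    half = n // 2
    if theta > 0:
        return range(-half, half)              # case 2: -n/2, ..., n/2 - 1
    return range(-half + 1, half + 1)          # case 3: -(n/2 - 1), ..., n/2

def arguments(n, theta):
    """The n angles (theta + 2*k*pi)/n for the k chosen by k_range."""
    return [(theta + 2 * k * math.pi) / n for k in k_range(n, theta)]

print(list(k_range(4, 1.0)), arguments(4, 1.0))
print(list(k_range(4, -1.0)), arguments(4, -1.0))
print(list(k_range(5, 3.0)), arguments(5, 3.0))
```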

84 Appendices for Part V: Calculus

In this chapter, I sometimes use the symbols ∀ (“for all”) and ∃ (“there exist(s)”).

84.1 Limits Formally Defined

Informally, “$\lim_{x \to a} f(x) = L$” means “For all values of x that are close to but not equal to a, f(x) is close to (or possibly even equal to) L.” Formally:

Definition 141. Let f be a real function on a real variable. Let L ∈ R. We say that the limit of f(x) as x approaches a is L if:

∀ε > 0, ∃δ > 0 ∶ x ∈ (a − δ, a + δ)∖{a} ⟹ f(x) ∈ (L − ε, L + ε).

We may denote this by “as x → a, f(x) → L” or “$\lim_{x \to a} f(x) = L$”.

I now discuss the above formal definition. You and I play a game. I let you pick any tiny but positive number ε₁. I guarantee that f(x) is within a distance ε₁ of the limit L, so long as x is within a distance δ₁ of the number a (but not equal to a), where I am free to pick δ₁ > 0 after you’ve told me your choice of ε₁. We try this and it works out nicely.

But you are not impressed. You say that, “Well, ε₁ was not such a tiny number after all. I don’t believe you can do better than that.” At this point, I invite you to pick any even tinier but still positive number ε₂. Again, I can guarantee that f(x) is within a distance ε₂ of the limit L, so long as x is within a distance δ₂ of the number a (but not equal to a), where again I am free to pick δ₂ > 0 after you’ve told me your choice of ε₂. We again try this and again it works out nicely. You’re still not impressed and you look for some even smaller positive number ε₃.

It turns out we can play this game forever. Regardless of whatever tiny (but always positive) number ε you pick, I can guarantee that f(x) is within a distance ε of L, so long as x is within a distance δ of the number a (but not equal to a), where I am free to pick δ > 0 after you’ve told me your choice of ε.

This is an example of how mathematical definitions are formed. First we have some intuitive, informal notion in mind (in this case a limit). Then with a little work, we write down a formal, precise, rigorous definition to formalise our informal notion. Our rigorous definition leaves no room for ambiguity or alternative interpretations.

Let’s now revisit our earlier examples, but now using our formal definition.

Example 89 (revisited). Consider the function f ∶ R → R defined by x ↦ 5x + 2. We now prove rigorously, using the formal definition of a limit, that $\lim_{x \to 3} f(x) = 17$.

Let ε > 0. Pick δ = ε/5. Then it is indeed true that for all x ∈ (3 − ε/5, 3 + ε/5) but x ≠ 3, we have f(x) = 5x + 2 ∈ (15 − 5 ⋅ ε/5 + 2, 15 + 5 ⋅ ε/5 + 2) = (17 − ε, 17 + ε). So indeed $\lim_{x \to 3} f(x) = 17$.

The game here is to figure out, based on your choice of ε, what δ I should pick in order that x ∈ (a − δ, a + δ)∖{a} implies f(x) ∈ (L − ε, L + ε).

Example 90 (revisited). Consider the function g ∶ (−∞, 3) ∪ (3, ∞) → R defined by x ↦ 5x + 2. We prove that $\lim_{x \to 3} g(x) = 17$.

Let ε > 0. Pick δ = ε/5. Then it is indeed true that for all x ∈ (3 − ε/5, 3 + ε/5)∖{3}, we have g(x) = 5x + 2 ∈ (15 − 5 ⋅ ε/5 + 2, 15 + 5 ⋅ ε/5 + 2) = (17 − ε, 17 + ε). So indeed $\lim_{x \to 3} g(x) = 17$.

Example 91 (revisited). Consider the function h ∶ R → R defined by x ↦ 5x + 2 for x ≠ 3 and h(3) = 0. We prove that $\lim_{x \to 3} h(x) = 17$.

Let ε > 0. Pick δ = ε/5. Then it is indeed true that for all x ∈ (3 − ε/5, 3 + ε/5)∖{3}, we have h(x) = 5x + 2 ∈ (15 − 5 ⋅ ε/5 + 2, 15 + 5 ⋅ ε/5 + 2) = (17 − ε, 17 + ε). So indeed $\lim_{x \to 3} h(x) = 17$.

In the next example, δ need not depend on ε. Indeed, δ can simply be any positive number! This is because of the peculiar way the function is defined.

Example 92 (revisited). Consider the function i ∶ R → R defined by x ↦ 0 for x ≠ 3 and i(3) = 17. We prove that $\lim_{x \to 3} i(x) = 0$.

Let ε > 0. Pick δ > 0 (that is, pick δ to be any positive number). Then it is indeed true that for all x ∈ (3 − δ, 3 + δ)∖{3}, we have i(x) = 0 ∈ (0 − ε, 0 + ε). So indeed $\lim_{x \to 3} i(x) = 0$.
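The ε and δ game of Example 89 (revisited) can be simulated for particular values of ε. The sketch below is only a sanity check, not a proof: it samples finitely many points x with 0 < ∣x − 3∣ < δ = ε/5 and confirms that 5x + 2 lands within ε of 17. The function name `check_epsilon` and the sampling scheme are assumptions made for illustration.

```python
def check_epsilon(eps, samples=1000):
    """Sample x with 0 < |x - 3| < eps/5 and test that |5x + 2 - 17| < eps."""
    delta = eps / 5                                # the delta chosen in Example 89
    for i in range(1, samples + 1):
        offset = delta * i / (samples + 1)         # strictly between 0 and delta
        for x in (3 - offset, 3 + offset):
            if not abs((5 * x + 2) - 17) < eps:
                return False
    return True

for eps in (1.0, 0.1, 0.001):
    print(eps, check_epsilon(eps))
```

Each smaller choice of ε simply forces a smaller δ, which is the game described in the discussion of Definition 141.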

84.2 Left- and Right-Sided Limits

Definition 142. Let f be a real function on a real variable. Let L ∈ R. We say that the left-sided limit of f(x) as x approaches a is L if:

∀ε > 0, ∃δ > 0 ∶ x ∈ (a − δ, a) ⟹ f(x) ∈ (L − ε, L + ε).

We may denote this by “as x ↗ a, f(x) → L” or “$\lim_{x \nearrow a} f(x) = L$”.

Definition 143. Let f be a real function on a real variable. Let L ∈ R. We say that the right-sided limit of f(x) as x approaches a is L if:

∀ε > 0, ∃δ > 0 ∶ x ∈ (a, a + δ) ⟹ f(x) ∈ (L − ε, L + ε).

We may denote this by “as x ↘ a, f(x) → L” or “$\lim_{x \searrow a} f(x) = L$”.

The following fact is immediate from the definitions.

Fact 91. $\lim_{x \to a} f(x) = L \iff \lim_{x \nearrow a} f(x) = L$ AND $\lim_{x \searrow a} f(x) = L$.

84.3 Infinite Limits and Vertical Asymptotes

Definition 144. (Two-Sided Infinite Limit.) We say that the limit of f(x) as x approaches a is ∞ if:

∀P ∈ R, ∃δ > 0 ∶ x ∈ (a − δ, a + δ)∖{a} ⟹ f(x) > P.

We may denote this by “as x → a, f(x) → ∞” or “$\lim_{x \to a} f(x) = \infty$”.

Definition 145. (Left-Sided Infinite Limit.) We say that the limit of f(x) as x approaches a from the left is ∞ if:

∀P ∈ R, ∃δ > 0 ∶ x ∈ (a − δ, a) ⟹ f(x) > P.

We may denote this by “as x ↗ a, f(x) → ∞” or “$\lim_{x \nearrow a} f(x) = \infty$”.

The definition of the right-sided infinite limit (“$\lim_{x \searrow a} f(x) = \infty$”) is very similar and thus omitted. Likewise with the definitions where the limit is −∞ (instead of ∞).

(Note that where I write x ↗ a, some others instead write x ↑ a or x → a⁻. And where I write x ↘ a, some others instead write x ↓ a or x → a⁺.)

Definition 146. (Vertical asymptote.) We say that x = a is a vertical asymptote of the graph of f if $\lim_{x \nearrow a} f(x) = \pm\infty$ or $\lim_{x \searrow a} f(x) = \pm\infty$.

84.4 Limits at Infinity, Horizontal, and Oblique Asymptotes

Definition 147. (Limit at Infinity.) We say that the limit of f(x) as x approaches ∞ is L and write “as x → ∞, f(x) → L” or “$\lim_{x \to \infty} f(x) = L$” if:

∀ε > 0, ∃P ∈ R ∶ x > P ⟹ f(x) ∈ (L − ε, L + ε).

The analogous definition where x approaches −∞ (instead of ∞) is omitted.

Definition 148. (Horizontal asymptote.) We say that y = L is a horizontal asymptote of the graph of f if $\lim_{x \to \infty} f(x) = L$ OR $\lim_{x \to -\infty} f(x) = L$.

Definition 149. (Limit at Infinity.) We say that the limit of f(x) as x approaches ∞ is ax + b if:

∀ε > 0, ∃P ∈ R ∶ x > P ⟹ f(x) ∈ (ax + b − ε, ax + b + ε).

We may denote this by “as x → ∞, f(x) → ax + b” or “$\lim_{x \to \infty} f(x) = ax + b$”.

The analogous definition where x approaches −∞ (instead of ∞) is omitted.

Definition 150. (Oblique asymptote.) We say that y = ax + b is an oblique asymptote of the graph of f if at least one of the following statements is true: $\lim_{x \to \infty} f(x) = ax + b$ OR $\lim_{x \to -\infty} f(x) = ax + b$.

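84.5 Limit Laws

Lemma 4. Let f, g ∶ R → R be functions. Let a, k, L, M ∈ R. Suppose $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = M$. Then

1. $\lim_{x \to a} [f(x) \pm g(x)] = L \pm M$,
2. $\lim_{x \to a} [k f(x)] = kL$,
3. $\lim_{x \to a} [f(x) g(x)] = LM$,
4. $\lim_{x \to a} \dfrac{1}{f(x)} = \dfrac{1}{L}$ (L ≠ 0),
5. $\lim_{x \to a} \dfrac{g(x)}{f(x)} = \dfrac{M}{L}$ (L ≠ 0).

Proof. We first write down what the statements $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = M$ mean:

• $\lim_{x \to a} f(x) = L \iff \forall \varepsilon_f > 0, \exists \delta_f > 0 ∶ x \in (a - \delta_f, a + \delta_f) \setminus \{a\} \implies f(x) \in (L - \varepsilon_f, L + \varepsilon_f)$.

• $\lim_{x \to a} g(x) = M \iff \forall \varepsilon_g > 0, \exists \delta_g > 0 ∶ x \in (a - \delta_g, a + \delta_g) \setminus \{a\} \implies g(x) \in (M - \varepsilon_g, M + \varepsilon_g)$.

1. Let $\varepsilon_1 > 0$. Pick small $\varepsilon_f, \varepsilon_g$ so that $\varepsilon_f + \varepsilon_g \leq \varepsilon_1$. Pick the $\delta_f, \delta_g$ that correspond to these $\varepsilon_f, \varepsilon_g$. Now pick $\delta_1 = \min\{\delta_f, \delta_g\}$, so that indeed $x \in (a - \delta_1, a + \delta_1) \setminus \{a\}$ implies

$$f(x) + g(x) \in (L + M - (\varepsilon_f + \varepsilon_g), L + M + (\varepsilon_f + \varepsilon_g)) \subseteq (L + M - \varepsilon_1, L + M + \varepsilon_1).$$

The proof that $\lim_{x \to a} [f(x) - g(x)] = L - M$ is very similar and omitted.

2. Let $\varepsilon_2 > 0$. Pick small $\varepsilon_f$ so that $k \varepsilon_f \leq \varepsilon_2$. Pick the $\delta_f$ that corresponds to this $\varepsilon_f$. Then indeed $x \in (a - \delta_f, a + \delta_f) \setminus \{a\}$ implies

$$k f(x) \in (k(L - \varepsilon_f), k(L + \varepsilon_f)) \subseteq (kL - \varepsilon_2, kL + \varepsilon_2).$$

3. Let $\varepsilon_3 > 0$.

Case #1. Assume L, M > 0.

Pick small $\varepsilon_f, \varepsilon_g$ so that $\varepsilon_f M + \varepsilon_g L + \varepsilon_f \varepsilon_g \leq \varepsilon_3$ and $L - \varepsilon_f, M - \varepsilon_g > 0$. Pick the corresponding $\delta_f, \delta_g$. Now pick $\delta_3 = \min\{\delta_f, \delta_g\}$, so that indeed $x \in (a - \delta_3, a + \delta_3) \setminus \{a\}$ implies

$$\begin{aligned}
f(x) g(x) &\in ((L - \varepsilon_f)(M - \varepsilon_g), (L + \varepsilon_f)(M + \varepsilon_g)) \\
&= (LM - \varepsilon_f M - \varepsilon_g L + \varepsilon_f \varepsilon_g, LM + \varepsilon_f M + \varepsilon_g L + \varepsilon_f \varepsilon_g) \\
&\subseteq (LM - \varepsilon_3, LM + \varepsilon_3).
\end{aligned}$$

The proof for the other cases (where L, M are not both positive) is similar and omitted.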

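4. Let $\varepsilon_4 \in (0, |1/L|)$. Pick $\varepsilon_f = \dfrac{\varepsilon_4 L^2}{1 + \varepsilon_4 L}$. Pick the $\delta_f$ that corresponds to this $\varepsilon_f$. First,

$$\varepsilon_4 \in \left(0, \left|\tfrac{1}{L}\right|\right) \implies 1 + 2\varepsilon_4 L \begin{cases} \ge 1, & \text{if } L > 0, \\ \le 1, & \text{if } L < 0 \end{cases} \implies \frac{1}{1 + 2\varepsilon_4 L} \begin{cases} \le 1, & \text{if } L > 0, \\ \ge 1, & \text{if } L < 0 \end{cases} \implies \frac{L}{1 + 2\varepsilon_4 L} \begin{cases} \le L, & \text{if } L > 0, \\ \le L, & \text{if } L < 0 \end{cases}$$

$$\iff \frac{L}{1 + 2\varepsilon_4 L} \le L \implies \varepsilon_4 L \ge \frac{\varepsilon_4 L}{1 + 2\varepsilon_4 L} \implies 1 - \frac{\varepsilon_4 L}{1 + 2\varepsilon_4 L} \overset{1}{\ge} 1 - \varepsilon_4 L.$$

Now, indeed $x \in (a - \delta_f, a + \delta_f) \setminus \{a\} \implies f(x) \in (L - \varepsilon_f, L + \varepsilon_f) \implies$

$$\frac{1}{f(x)} \in \left(\frac{1}{L + \varepsilon_f},\ \frac{1}{L - \varepsilon_f}\right) = \left(\frac{1}{L + \frac{\varepsilon_4 L^2}{1 + \varepsilon_4 L}},\ \frac{1}{L - \frac{\varepsilon_4 L^2}{1 + \varepsilon_4 L}}\right) = \left(\frac{1 + \varepsilon_4 L}{L + 2\varepsilon_4 L^2},\ \frac{1 + \varepsilon_4 L}{L}\right)$$

$$= \left(\frac{1}{L}\left(\frac{1 + \varepsilon_4 L}{1 + 2\varepsilon_4 L}\right),\ \frac{1}{L} + \varepsilon_4\right) = \left(\frac{1}{L}\left(1 - \frac{\varepsilon_4 L}{1 + 2\varepsilon_4 L}\right),\ \frac{1}{L} + \varepsilon_4\right) \subseteq \left(\frac{1}{L}(1 - \varepsilon_4 L),\ \frac{1}{L} + \varepsilon_4\right) = \left(\frac{1}{L} - \varepsilon_4,\ \frac{1}{L} + \varepsilon_4\right),$$

where $\subseteq$ uses $\overset{1}{\ge}$.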

5 follows directly from 3 and 4.
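The choice of $\delta_1 = \min\{\delta_f, \delta_g\}$ in part 1 can be tried out numerically. The following is only a minimal sketch: the functions f(x) = 2x + 1 and g(x) = x² near a = 1, the helper names, and the even split of ε are all illustrative assumptions, not part of the lemma.

```python
# Sum law sketch near a = 1 with f(x) = 2x + 1 and g(x) = x**2 (illustrative choices).
def f(x):
    return 2 * x + 1

def g(x):
    return x ** 2

a, L, M = 1.0, 3.0, 1.0  # lim f = 3 and lim g = 1 as x -> 1

def delta_f(eps):
    # |f(x) - 3| = 2|x - 1| < eps whenever |x - 1| < eps / 2
    return eps / 2

def delta_g(eps):
    # |x**2 - 1| = |x - 1||x + 1| < 3|x - 1| whenever |x - 1| < 1
    return min(1.0, eps / 3)

def delta_sum(eps):
    # Split eps evenly between f and g, then take the smaller delta (part 1).
    return min(delta_f(eps / 2), delta_g(eps / 2))

def check_sum_law(eps, samples=1000):
    # Sample points on both sides of a, strictly inside the punctured interval.
    d = delta_sum(eps)
    for i in range(1, samples + 1):
        offset = d * i / (samples + 1)
        for x in (a - offset, a + offset):
            if abs(f(x) + g(x) - (L + M)) >= eps:
                return False
    return True

for eps in (1.0, 0.1, 0.001):
    print(eps, delta_sum(eps), check_sum_law(eps))
```

A sampled check like this is of course not a proof; it only illustrates how the two δ's are combined.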


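Theorem 17. (Squeeze or Sandwich or Pinching Theorem.) Let $f, g, h : \mathbb{R} \to \mathbb{R}$ be functions. Suppose $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = L$. If $\exists \delta > 0 : x \in (a - \delta, a + \delta) \setminus \{a\} \implies f(x) \le h(x) \le g(x)$, then $\lim_{x \to a} h(x) = L$.

Proof. Let $\varepsilon > 0$. Pick any $\varepsilon_f, \varepsilon_g$ to be small enough so that $\varepsilon_f + \varepsilon_g \le \varepsilon$. Pick the $\delta_f, \delta_g$ that correspond to these $\varepsilon_f, \varepsilon_g$. Now pick $\delta_1 = \min\{\delta, \delta_f, \delta_g\}$, so that indeed for all $x \in (a - \delta_1, a + \delta_1) \setminus \{a\}$, we have $L - \varepsilon_f < f(x) \le h(x) \le g(x) < L + \varepsilon_g$, that is, $h(x) \in (L - \varepsilon_f, L + \varepsilon_g) \subseteq (L - \varepsilon, L + \varepsilon)$.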
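As a quick numerical illustration of the Squeeze Theorem, the sketch below squeezes h(x) = x² sin(1/x) between −x² and x² near a = 0. The choice of functions and all names are assumptions made for illustration only.

```python
import math

def lower(x):
    return -(x * x)

def upper(x):
    return x * x

def h(x):
    # Defined for x != 0 only, which is all the theorem needs near a = 0.
    return x * x * math.sin(1 / x)

def squeezed(x):
    return lower(x) <= h(x) <= upper(x)

# Both bounds tend to 0 as x -> 0, so h is forced to 0 as well.
for k in range(1, 8):
    x = 10.0 ** (-k)
    print(x, lower(x), h(x), upper(x))
```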


84.6 Continuity

Definition 151. A function f is left-continuous at a point a if $\lim_{x \nearrow a} f(x) = f(a)$ and right-continuous at a if $\lim_{x \searrow a} f(x) = f(a)$.

Fact 92. A function f is continuous at a point a ⇐⇒ f is both left- and right-continuous at a.

Proof. This is obvious from Fact 91 and the definitions of left- and right-continuity.

Definition 152. A function f is continuous on the open interval (a, b) if it is continuous at every point x ∈ (a, b).

Definition 153. A function f is continuous on the closed interval [a, b] if it is continuous at every point x ∈ (a, b), right-continuous at a, and left-continuous at b.

Definition 154. A function f is continuous on the interval (a, b] if it is continuous at every point x ∈ (a, b) and left-continuous at b.

Definition 155. A function f is continuous on the interval [a, b) if it is continuous at every point x ∈ (a, b) and right-continuous at a.


84.7 Differentiation

Definition 156. A function f is left-differentiable at a point a if $\lim_{x \nearrow a} \dfrac{f(x) - f(a)}{x - a}$ exists and right-differentiable at a if $\lim_{x \searrow a} \dfrac{f(x) - f(a)}{x - a}$ exists.

Fact 93. A function f is differentiable at a point a ⇐⇒ $\lim_{x \nearrow a} \dfrac{f(x) - f(a)}{x - a} = \lim_{x \searrow a} \dfrac{f(x) - f(a)}{x - a}$.

Proof. This is obvious from Fact 91 and the definitions of left- and right-differentiability.

Definition 157. A function f is differentiable on the open interval (a, b) if it is differentiable at every point x ∈ (a, b).

Definition 158. A function f is differentiable on the closed interval [a, b] if it is differentiable at every point x ∈ (a, b), right-differentiable at a, and left-differentiable at b.

Definition 159. A function f is differentiable on the interval (a, b] if it is differentiable at every point x ∈ (a, b) and left-differentiable at b.

Definition 160. A function f is differentiable on the interval [a, b) if it is differentiable at every point x ∈ (a, b) and right-differentiable at a.


Lemma 5. $\lim_{\theta \to 0} \dfrac{\sin \theta}{\theta} = 1$.

Proof. Consider the unit circle (from which the trigonometric functions and the radian were defined). First restrict attention to θ ∈ (0, 0.5π).

Clearly, for all θ ∈ (0, 0.5π), BC < arc AB. But by definition of the radian and the sine function, we have θ = arc AB and sin θ = BC. Thus, sin θ < θ.

Clearly, the area of △OAD is greater than the area of the circular sector OAB. That is, $0.5 \tan \theta > \pi \cdot 1^2 \times \dfrac{\theta}{2\pi} = 0.5\theta$. Or tan θ > θ.

Altogether then, for all θ ∈ (0, 0.5π), sin θ < θ < tan θ. Taking reciprocals (all three quantities are positive) reverses the inequalities: $\dfrac{1}{\sin \theta} > \dfrac{1}{\theta} > \dfrac{\cos \theta}{\sin \theta}$. Multiply by sin θ to get $1 > \dfrac{\sin \theta}{\theta} > \cos \theta$. We can show this last pair of inequalities also holds for all θ ∈ (−0.5π, 0).

Since $\lim_{\theta \to 0} 1 = 1$ and $\lim_{\theta \to 0} \cos \theta = 1$, we have $\lim_{\theta \to 0} \dfrac{\sin \theta}{\theta} = 1$ (Squeeze Theorem).
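The pair of inequalities cos θ < sin θ / θ < 1 can be spot-checked numerically. This is a minimal sketch with hypothetical helper names; the sample angles are kept moderately small because, in floating point, sin θ / θ rounds to exactly 1 for very tiny θ.

```python
import math

def ratio(theta):
    return math.sin(theta) / theta

def bounds_hold(theta):
    # cos(theta) < sin(theta)/theta < 1 for 0 < |theta| < pi/2
    return math.cos(theta) < ratio(theta) < 1

for k in range(0, 4):
    theta = 10.0 ** (-k)
    print(theta, math.cos(theta), ratio(theta), bounds_hold(theta))
```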


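Proposition 1 (reproduced from p. 122). Let $f : A \to \mathbb{R}$ and $g : B \to \mathbb{R}$ be differentiable functions with derivatives $f'$ and $g'$. Suppose also that the composite function $f \circ g : A \to \mathbb{R}$ is well-defined. Let $k \in \mathbb{R}$ be a constant. Then:

$$\frac{d}{dx} k = 0, \qquad \frac{d}{dx} (f \pm g) = f' \pm g', \qquad \frac{d}{dx} (kf) = kf',$$

$$\frac{d}{dx} x^k = k x^{k-1}, \qquad \frac{d}{dx} e^x = e^x, \qquad \frac{d}{dx} \ln x = \frac{1}{x},$$

$$\frac{d}{dx} \sin x = \cos x, \qquad \frac{d}{dx} \cos x = -\sin x,$$

$$\frac{d}{dx} (f \cdot g) = g \cdot f' + f \cdot g', \qquad \frac{d}{dx} \frac{f}{g} = \frac{g \cdot f' - f \cdot g'}{g \cdot g}, \qquad \frac{d}{dx} (f \circ g) = \frac{d (f \circ g)}{dg} \cdot \frac{dg}{dx}.$$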

Proof. Constant. Let $f : \mathbb{R} \to \mathbb{R}$ be defined by $x \mapsto k$, where $k \in \mathbb{R}$ is a constant. Then for all $a \in \mathbb{R}$, $\lim_{x \to a} \dfrac{f(x) - f(a)}{x - a}$ exists and moreover
$$\lim_{x \to a} \frac{f(x) - f(a)}{x - a} = \lim_{x \to a} \frac{k - k}{x - a} = 0.$$
Hence, the derivative of f is the function $f' : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto 0$. In shorthand, we may write $dk/dx = 0$.

Sum and Difference. $f \pm g$ is the function with domain $A \cap B$, codomain $\mathbb{R}$, and mapping rule $x \mapsto f(x) \pm g(x)$. For every $a \in A \cap B$, $\lim_{x \to a} \dfrac{f(x) \pm g(x) - [f(a) \pm g(a)]}{x - a}$ exists, with
$$\lim_{x \to a} \frac{f(x) \pm g(x) - [f(a) \pm g(a)]}{x - a} = \lim_{x \to a} \left( \frac{f(x) - f(a)}{x - a} \pm \frac{g(x) - g(a)}{x - a} \right) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a} \pm \lim_{x \to a} \frac{g(x) - g(a)}{x - a} = f'(a) \pm g'(a),$$
where the penultimate = used Lemma 4.1. Hence, the derivative of $f \pm g$ is the function $f' \pm g'$ with domain $A \cap B$, codomain $\mathbb{R}$, and mapping rule $x \mapsto f'(x) \pm g'(x)$. We can write this in shorthand as $d(f \pm g)/dx = f' \pm g'$.

Constant Multiple.
$$\frac{d}{dx}(kf) = \frac{d}{dx} \Big( \underbrace{f + f + f + \dots + f}_{k \text{ times}} \Big) = \underbrace{f' + f' + f' + \dots + f'}_{k \text{ times}} = kf'.$$
(As written, this argument covers only positive integers k; for a general constant $k \in \mathbb{R}$, apply Lemma 4.2 to the difference quotient of $kf$.)


Sine. Let h be the function with domain $\mathbb{R}$, codomain $\mathbb{R}$, and mapping rule $x \mapsto \sin x$. For every $a \in \mathbb{R}$, $\lim_{x \to a} \dfrac{h(x) - h(a)}{x - a}$ exists and moreover

$$\lim_{x \to a} \frac{h(x) - h(a)}{x - a} = \lim_{x \to a} \frac{\sin x - \sin a}{x - a} = \lim_{x \to a} \frac{2 \sin \frac{x - a}{2} \cos \frac{x + a}{2}}{x - a} \overset{1}{=} \lim_{x \to a} \cos \frac{x + a}{2} \cdot \lim_{x \to a} \frac{\sin \frac{x - a}{2}}{\frac{x - a}{2}} \overset{2}{=} \lim_{x \to a} \cos \frac{x + a}{2} \overset{3}{=} \cos a,$$

where $\overset{1}{=}$ and $\overset{2}{=}$ used Lemmata 4 and 5, and $\overset{3}{=}$ uses the fact that the cosine function is continuous (admittedly we haven’t proven this yet, but this should be “obvious”). Hence, the derivative of h is the function with domain $\mathbb{R}$, codomain $\mathbb{R}$, and mapping rule $x \mapsto \cos x$. We can write this in shorthand as $d \sin x / dx = \cos x$.

Cosine. $d \cos x / dx = d \sin(x + \pi/2) / dx = \cos(x + \pi/2) = -\sin x$.

Product Rule. $f \cdot g$ is the function with domain $A \cap B$, codomain $\mathbb{R}$, and mapping rule $x \mapsto f(x) g(x)$. For every $a \in A \cap B$, $\lim_{x \to a} \dfrac{f(x) g(x) - f(a) g(a)}{x - a}$ exists and moreover

$$\lim_{x \to a} \frac{f(x) g(x) - f(a) g(a)}{x - a} = \lim_{x \to a} \frac{f(x) g(x) - f(x) g(a) + f(x) g(a) - f(a) g(a)}{x - a}$$

$$= \lim_{x \to a} \left[ f(x) \frac{g(x) - g(a)}{x - a} + g(a) \frac{f(x) - f(a)}{x - a} \right]$$

$$\overset{1}{=} \lim_{x \to a} \left[ f(x) \frac{g(x) - g(a)}{x - a} \right] + \lim_{x \to a} \left[ g(a) \frac{f(x) - f(a)}{x - a} \right]$$

$$\overset{2}{=} \left[ \lim_{x \to a} f(x) \right] \left[ \lim_{x \to a} \frac{g(x) - g(a)}{x - a} \right] + g(a) \lim_{x \to a} \frac{f(x) - f(a)}{x - a}$$

$$= f(a) g'(a) + g(a) f'(a),$$

where $\overset{1}{=}$ and $\overset{2}{=}$ used Lemma 4. Hence, the derivative of $f \cdot g$ is the function $f g' + g f'$ with domain $A \cap B$, codomain $\mathbb{R}$, and mapping rule $x \mapsto f(x) g'(x) + g(x) f'(x)$. We can write this in shorthand as $d(f \cdot g)/dx = f g' + g f'$.


Chain Rule. We are tempted to simply write the following “proof”:

$$(f \circ g)'(a) = \lim_{x \to a} \frac{f(g(x)) - f(g(a))}{x - a} = \lim_{x \to a} \left[ \frac{f(g(x)) - f(g(a))}{g(x) - g(a)} \cdot \frac{g(x) - g(a)}{x - a} \right] = \lim_{x \to a} \frac{f(g(x)) - f(g(a))}{g(x) - g(a)} \lim_{x \to a} \frac{g(x) - g(a)}{x - a} = f'(g(a)) g'(a).$$

However, the above “proof” commits the cardinal sin of (possibly) dividing by zero, because there is the possibility that g(x) = g(a) for values of x in the neighbourhood of a!

To get around this irksome technicality, we need to play a little trick. Define

$$\varphi(x) = \begin{cases} \dfrac{f(g(x)) - f(g(a))}{g(x) - g(a)}, & \text{if } g(x) \neq g(a), \\ f'(g(a)), & \text{if } g(x) = g(a). \end{cases}$$

Note that

$$\lim_{x \to a} \varphi(x) \overset{0}{=} \begin{cases} \lim_{x \to a} \dfrac{f(g(x)) - f(g(a))}{g(x) - g(a)} = f'(g(a)), & \text{if } g(x) \neq g(a), \\ \lim_{x \to a} f'(g(a)) = f'(g(a)), & \text{if } g(x) = g(a). \end{cases}$$

$\overset{0}{=}$ will be used shortly. Now, observe that

$$\frac{f(g(x)) - f(g(a))}{x - a} \overset{1}{=} \varphi(x) \frac{g(x) - g(a)}{x - a},$$

because if $g(x) \neq g(a)$, then $\overset{1}{=}$ is clearly true (by the definition of $\varphi$); and if $g(x) = g(a)$, then $\overset{1}{=}$ is again clearly true, because

$$\frac{f(g(x)) - f(g(a))}{x - a} = 0 = \frac{g(x) - g(a)}{x - a}.$$

So

$$(f \circ g)'(a) = \lim_{x \to a} \frac{f(g(x)) - f(g(a))}{x - a} \overset{1}{=} \lim_{x \to a} \left[ \varphi(x) \frac{g(x) - g(a)}{x - a} \right] = \lim_{x \to a} \varphi(x) \lim_{x \to a} \frac{g(x) - g(a)}{x - a} \overset{0}{=} f'(g(a)) g'(a).$$


Quotient Rule. Using the Product Rule (P) and the Chain Rule (C),
$$\frac{d}{dx} \frac{f}{g} = \frac{d}{dx} \left( f \times \frac{1}{g} \right) = f' \frac{1}{g} + f \frac{d}{dx} \frac{1}{g} \overset{P,C}{=} \frac{f'}{g} + f (-1) \frac{1}{g \cdot g} g' = \frac{g f' - f g'}{g \cdot g}.$$

Natural Logarithm. See Fact 100.

Exponential. On the one hand, $\dfrac{d}{dx} \ln e^x = \dfrac{d}{dx} x = 1$. On the other, $\dfrac{d}{dx} \ln e^x = \dfrac{1}{e^x} \dfrac{d}{dx} e^x$. Hence, $\dfrac{1}{e^x} \dfrac{d}{dx} e^x = 1$. Rearranging, $\dfrac{d}{dx} e^x = e^x$.

Power Rule. Using the Chain Rule and also the derivatives of the natural logarithm and exponential functions, we have:
$$\frac{d x^n}{dx} = \frac{d}{dx} e^{n \ln x} = e^{n \ln x} \frac{n}{x} = x^n \frac{n}{x} = n x^{n-1}.$$
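The Product, Quotient, and Chain Rules above can be sanity-checked against a central-difference approximation of the derivative. This is a rough sketch only: the sample functions f = sin and g(x) = x² + 1, the step size h, and the helper names are all illustrative assumptions.

```python
import math

def derivative(fn, x, h=1e-6):
    # Central difference approximation of fn'(x); h is an arbitrary small step.
    return (fn(x + h) - fn(x - h)) / (2 * h)

def f(x):
    return math.sin(x)

def df(x):
    return math.cos(x)

def g(x):
    return x * x + 1

def dg(x):
    return 2 * x

def product_rule(x):
    return g(x) * df(x) + f(x) * dg(x)

def quotient_rule(x):
    return (g(x) * df(x) - f(x) * dg(x)) / (g(x) * g(x))

def chain_rule(x):
    return df(g(x)) * dg(x)

x0 = 0.7
print(derivative(lambda x: f(x) * g(x), x0), product_rule(x0))
print(derivative(lambda x: f(x) / g(x), x0), quotient_rule(x0))
print(derivative(lambda x: f(g(x)), x0), chain_rule(x0))
```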


84.8 Differentiability Implies Continuity

Theorem 1 (reproduced from p. 128). If $f : D \to \mathbb{R}$ is differentiable at $a \in D$, then f is continuous at $a \in D$.

Proof.
$$\lim_{x \to a} [f(x) - f(a)] = \lim_{x \to a} \left[ (x - a) \frac{f(x) - f(a)}{x - a} \right] = \lim_{x \to a} (x - a) \lim_{x \to a} \frac{f(x) - f(a)}{x - a} = 0 \cdot f'(a) = 0.$$

So $\lim_{x \to a} f(x) = f(a)$.


84.9 Maximum, Minimum, and Turning Points

Definition 161. Let $f : D \to \mathbb{R}$ and $x \in D$. If $\exists \delta > 0$: $\forall a \in (x - \delta, x + \delta) \cap D$,

1. ... f(x) ≥ f(a), then x is a maximum point of f and f(x) a maximum value.

2. ... f(x) ≤ f(a), then x is a minimum point of f and f(x) a minimum value.

3. ... f(x) > f(a) (for a ≠ x), then x is a strict maximum point of f and f(x) a strict maximum value.

4. ... f(x) < f(a) (for a ≠ x), then x is a strict minimum point of f and f(x) a strict minimum value.

Fact 10 (reproduced from p. 132). Let $f : \mathbb{R} \to \mathbb{R}$ be a differentiable function. Let $a, b \in \mathbb{R}$ with b > a. Then

1. f is increasing on (a, b) ⇐⇒ f′(c) ≥ 0, for all c ∈ (a, b).

2. f is decreasing on (a, b) ⇐⇒ f′(c) ≤ 0, for all c ∈ (a, b).

3. f is strictly decreasing on (a, b) ⇐⇒ f′(c) < 0, for all c ∈ (a, b).

4. f is strictly increasing on (a, b) ⇐⇒ f′(c) > 0, for all c ∈ (a, b).

5. f is both increasing and decreasing at c ⇐⇒ f′(c) = 0.

Proof. 1. Suppose f is increasing on (a, b). That is, by definition, $\forall x_1, x_2 \in (a, b)$, $x_2 > x_1 \implies f(x_2) \ge f(x_1)$. Equivalently, $x_2 - x_1 > 0 \implies f(x_2) - f(x_1) \ge 0$. Equivalently, for all distinct $x_1, x_2 \in (a, b)$, $\dfrac{f(x_2) - f(x_1)}{x_2 - x_1} \ge 0$. This implies that $\forall c \in (a, b)$, $\lim_{x \to c} \dfrac{f(x) - f(c)}{x - c} \ge 0$. That is, f′(c) ≥ 0 for all c ∈ (a, b).

Now suppose f′(c) ≥ 0, for all c ∈ (a, b). Then $\exists \delta > 0$ such that $\forall c \in (a, b)$, $\forall x \in (c - \delta, c + \delta)$, $\dfrac{f(x) - f(c)}{x - c} \ge 0$. Equivalently, $\forall x_1, x_2 \in (c - \delta, c + \delta)$, $x_2 > x_1 \implies f(x_2) \ge f(x_1)$. Since δ is fixed and the previous sentence is true if we replace c with any other d ∈ (a, b), we have that $\forall x_1, x_2 \in (a, b)$, $x_2 > x_1 \implies f(x_2) \ge f(x_1)$. This completes the proof of 1.

The proofs of 2, 3, and 4 are similar and thus omitted.

5. f is both increasing and decreasing at c ⇐⇒ f′(c) ≥ 0 AND f′(c) ≤ 0 ⇐⇒ f′(c) = 0.


Proposition 2 (reproduced from p. 143). (Interior Extremum Theorem [IET].) Let $f : D \to \mathbb{R}$ be a differentiable function. If a is a maximum or minimum point AND in the interior of D, then f′(a) = 0 (i.e. a is a stationary point).

Proof. Suppose for contradiction that f′(a) > 0. That is, $\lim_{x \to a} \dfrac{f(x) - f(a)}{x - a} > 0$. That is, there exists δ > 0 such that for all $x \in (a - \delta, a + \delta) \setminus \{a\}$, $\dfrac{f(x) - f(a)}{x - a} > 0$. That is, f(x) > f(a) if x > a and f(x) < f(a) if x < a. So by definition then, a can neither be a maximum nor a minimum point.

Similarly, suppose for contradiction that f′(a) < 0 ... (similar reasoning omitted). We conclude that if a is a maximum or a minimum point, then f′(a) = 0.

Fact 94. Suppose $f : D \to \mathbb{R}$ is continuous at a. If there exists δ > 0 such that f is increasing on (a − δ, a) and f is decreasing on (a, a + δ), then f attains a maximum at a. (Similarly, if there exists δ > 0 such that f is decreasing on (a − δ, a) and f is increasing on (a, a + δ), then f attains a minimum at a.)

Proof. By continuity, $f(a) = \sup_{x \in (a - \delta, a)} f(x)$ and $f(a) = \sup_{x \in (a, a + \delta)} f(x)$ (see footnote 88). So indeed f(a) ≥ f(x) for all x ∈ (a − δ, a + δ) and f attains a maximum at a.

(The proof of the “similarly” bit is similar and omitted.)

Footnote 88. $\sup_{x \in A} f(x)$ is the smallest real number L such that L ≥ f(x) for all x ∈ A.


84.10 Concavity and Inflexion Points

These are the general definitions of concavity and inflexion points, without assuming that f is differentiable.

Definition 162. A function f is concave downwards (or concave) on an interval if for every $x_1, x_2$ in that interval and every α ∈ [0, 1], $f(\alpha x_1 + (1 - \alpha) x_2) \ge \alpha f(x_1) + (1 - \alpha) f(x_2)$.

Definition 163. A function f is concave upwards (or convex) on an interval (in the function’s domain) if for every x1 , x2 in that interval and every α ∈ [0, 1], f (αx1 + (1 − α) x2 ) ≤ αf (x1 ) + (1 − α) f (x2 ).

Definition 164. A function f is linear on an interval (in the function’s domain) if for every $x_1, x_2$ in that interval and every α ∈ [0, 1], $f(\alpha x_1 + (1 - \alpha) x_2) = \alpha f(x_1) + (1 - \alpha) f(x_2)$.

Of course, if a function is linear on an interval, then it is also both concave and convex on that interval.

Definition 165. Suppose $f : D \to \mathbb{R}$ is continuous at a ∈ D. Then a is an inflexion point of f if there exists δ > 0 such that

1. f is concave downwards on (a − δ, a), but concave upwards on (a, a + δ); OR

2. f is concave upwards on (a − δ, a), but concave downwards on (a, a + δ),

and moreover, f is not linear on (a − δ, a + δ).

The “moreover” bit is to rule out the trivial case where f is simply linear (and thus both concave and convex) on (a − δ, a + δ). In this case, we do not want to say that a is an inflexion point.


The next Lemma is an alternative definition of concavity.

Lemma 6. (a) f is concave downwards on an interval ⇐⇒ For every $x_1 < x_2 < x_3$ in that interval,
$$\frac{f(x_2) - f(x_1)}{x_2 - x_1} \ge \frac{f(x_3) - f(x_2)}{x_3 - x_2}.$$

(b) f is concave upwards on an interval ⇐⇒ For every $x_1 < x_2 < x_3$ in that interval,
$$\frac{f(x_2) - f(x_1)}{x_2 - x_1} \le \frac{f(x_3) - f(x_2)}{x_3 - x_2}.$$

Proof. Pick any $x_1 < x_2 < x_3$ in the interval. Let $\alpha = \dfrac{x_3 - x_2}{x_3 - x_1} \in [0, 1]$, so that
$$\alpha x_1 + (1 - \alpha) x_3 = \frac{x_3 - x_2}{x_3 - x_1} x_1 + \frac{x_2 - x_1}{x_3 - x_1} x_3 = x_2.$$
And so by Definition 162,

$$f(x_2) \ge \frac{x_3 - x_2}{x_3 - x_1} f(x_1) + \frac{x_2 - x_1}{x_3 - x_1} f(x_3)$$

$$\iff (x_3 - x_1) f(x_2) \ge (x_3 - x_2) f(x_1) + (x_2 - x_1) f(x_3)$$

$$\iff x_3 [f(x_2) - f(x_1)] \ge x_2 [f(x_3) - f(x_1)] - x_1 [f(x_3) - f(x_2)]$$

$$\iff (x_3 - x_2) [f(x_2) - f(x_1)] \ge x_2 [f(x_3) - f(x_1)] - x_1 [f(x_3) - f(x_2)] - x_2 [f(x_2) - f(x_1)] = x_2 f(x_3) - x_1 [f(x_3) - f(x_2)] - x_2 f(x_2) = (x_2 - x_1) [f(x_3) - f(x_2)]$$

$$\iff \frac{f(x_2) - f(x_1)}{x_2 - x_1} \ge \frac{f(x_3) - f(x_2)}{x_3 - x_2}.$$

This completes the proof of (a). The proof of (b) is similar and thus omitted.
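The chord-slope characterisation in Lemma 6 is easy to test on a grid of points. The sketch below assumes ln as a sample concave function and exp as a sample convex one; the grid and helper names are illustrative, not from the text.

```python
import itertools
import math

def slope(fn, u, v):
    # Slope of the chord of fn between u and v.
    return (fn(v) - fn(u)) / (v - u)

def chords_decrease(fn, x1, x2, x3):
    # Lemma 6(a) for x1 < x2 < x3: left chord slope >= right chord slope.
    return slope(fn, x1, x2) >= slope(fn, x2, x3)

grid = [0.5 * i for i in range(1, 21)]            # 0.5, 1.0, ..., 10.0
triples = list(itertools.combinations(grid, 3))   # each triple is increasing

# ln is concave, so every triple should pass; exp is convex, so none should.
print(all(chords_decrease(math.log, *t) for t in triples))
print(any(chords_decrease(math.exp, *t) for t in triples))
```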


84.11 Concavity and Inflexion Points with Differentiability

Fact 95. Let f ∶ D → R be differentiable at a ∈ D. Then a is an inflexion point ⇐⇒ ∃δ > 0 such that one of the following is true: (a) ∀x ∈ (a−δ, a), f ′ (a) (x − a)+f (a) ≥ f (x) and ∀x ∈ (a, a+δ), f ′ (a) (x − a)+f (a) ≤ f (x), with at least one of these inequalities being strict.

(b) ∀x ∈ (a−δ, a), f ′ (a) (x − a)+f (a) ≤ f (x) and ∀x ∈ (a, a+δ), f ′ (a) (x − a)+f (a) ≥ f (x), with at least one of these inequalities being strict.

Here is the informal interpretation of the above fact. Note that the tangent line has equation y = f ′ (a) (x − a) + f (a). So (a) is the condition that to the left of a, the tangent line at a is above the graph of f ; but to the right of a, the tangent line is below the graph of f . And (b) is the condition that to the left of a, the tangent line at a is below the graph of f ; but to the right of a, the tangent line is above the graph of f . The additional condition that “at least one of these inequalities is strict” is to avoid the trivial situation where f is linear on the interval (a − δ, a + δ).

Proof. (a) Since x − a < 0 for x ∈ (a − δ, a), dividing by x − a reverses the inequality: $\forall x \in (a - \delta, a)$, $f'(a)(x - a) + f(a) \ge f(x)$ ⇐⇒ $\forall x \in (a - \delta, a)$, $f'(a) \le \dfrac{f(x) - f(a)}{x - a}$ ⇐⇒ $\forall x \in (a - \delta, a)$, $\lim_{t \to a} \dfrac{f(t) - f(a)}{t - a} \le \dfrac{f(x) - f(a)}{x - a}$ ⇐⇒ ∃δ > 0 such that for all b sufficiently close to a, $\dfrac{f(b) - f(a)}{b - a} \le \dfrac{f(x) - f(a)}{x - a}$ $\forall x \in (a - \delta, a)$ ⇐⇒ f is concave on (a − δ, a) (Lemma 6).

The proof that “∀x ∈ (a, a + δ), f′(a)(x − a) + f(a) ≤ f(x)” ⇐⇒ “f is convex on (a, a + δ)” is similar and thus omitted. This completes the proof of (a). The proof of (b) is similar and thus omitted.


Proposition 2 (reproduced from p. 148). Let $f : D \to \mathbb{R}$ be a twice-differentiable function.

(a) f is concave downwards on an interval ⇐⇒ f′ is decreasing on this interval ⇐⇒ f′′(x) ≤ 0 for every x in this interval.

(b) f is concave upwards on an interval ⇐⇒ f′ is increasing on this interval ⇐⇒ f′′(x) ≥ 0 for every x in this interval.

(c) If a is an inflexion point of f, then f′′(a) = 0.

Proof. (a) “f is concave downwards on an interval” ⇐⇒ “For every $x_1 < x_2 < x_3$ in that interval, $\dfrac{f(x_2) - f(x_1)}{x_2 - x_1} \ge \dfrac{f(x_3) - f(x_2)}{x_3 - x_2}$ (Lemma 6)” ⇐⇒ “For all a, b in that interval such that a > b ⟹ $\lim_{x \to a} \dfrac{f(x) - f(a)}{x - a} \le \lim_{x \to b} \dfrac{f(x) - f(b)}{x - b}$” ⇐⇒ “f′ is decreasing on that interval” ⇐⇒ “f′′(x) ≤ 0 for every x in this interval (Fact 10)”. This completes the proof of (a). The proof of (b) is similar and thus omitted.

(c) By (a) and (b) and by the definition of an inflexion point, f′ is decreasing on some interval on one side of a and increasing on some interval on the other side of a. Moreover, f′ is continuous (since f′′ exists). By Fact 94 then, f′ attains a maximum or minimum at a. By the IET then, f′′(a) = 0.


Proposition 3 (reproduced from p. 151). Let f be a twice-differentiable function. Let a be a stationary point (i.e. f′(a) = 0).

1. If f′′(a) < 0, then a is a maximum point.

2. If f′′(a) > 0, then a is a minimum point.

3. If f′′(a) = 0, then a could be a maximum point, a minimum point, an inflexion point, or none of the above!

Proof. 1. If f′(a) = 0 and f′′(a) < 0, then ∃δ > 0 ∶ f′(x) > 0 for all x ∈ (a − δ, a) and f′(x) < 0 for all x ∈ (a, a + δ). And so by Fact 94, a is a maximum point.

2. The proof is similar and thus omitted.

3. We will show that all four possibilities exist. Define $f : \mathbb{R} \to \mathbb{R}$ by $x \mapsto x^4$. Then f′(0) = f′′(0) = 0 and 0 is a minimum point.

Define $g : \mathbb{R} \to \mathbb{R}$ by $x \mapsto -x^4$. Then g′(0) = g′′(0) = 0 and 0 is a maximum point.

Define $h : \mathbb{R} \to \mathbb{R}$ by $x \mapsto x^3$. Then h′(0) = h′′(0) = 0 and 0 is an inflexion point.

Define $i : \mathbb{R} \to \mathbb{R}$ by
$$i(x) = \begin{cases} x^5 \sin \dfrac{1}{x}, & \text{for } x \neq 0, \\ 0, & \text{for } x = 0. \end{cases}$$

Note that i is indeed twice-differentiable with $i' : \mathbb{R} \to \mathbb{R}$ and $i'' : \mathbb{R} \to \mathbb{R}$ defined by
$$i'(x) = \begin{cases} 5x^4 \sin \dfrac{1}{x} - x^3 \cos \dfrac{1}{x}, & \text{for } x \neq 0, \\ 0, & \text{for } x = 0, \end{cases}$$

$$i''(x) = \begin{cases} 20x^3 \sin \dfrac{1}{x} - 8x^2 \cos \dfrac{1}{x} - x \sin \dfrac{1}{x}, & \text{for } x \neq 0, \\ 0, & \text{for } x = 0. \end{cases}$$

We indeed have i′ (0) = i′′ (0) = 0. However, near 0, i(x) fluctuates infinitely often between negative and positive values. So 0 is neither a maximum point nor a minimum point.

Moreover, near 0, i′′ (x) fluctuates infinitely often between negative and positive values. So there is no interval to the left of 0 on which i is concave or convex. And there is no interval to the right of 0 on which i is concave or convex. Thus, 0 is not an inflexion point.
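The sign changes of i and i′′ near 0 can be seen by sampling at points where sin(1/x) = ±1. This is a small sketch under that sampling assumption; the function names `i`, `i2`, and `sample` are hypothetical helpers.

```python
import math

def i(x):
    return x ** 5 * math.sin(1 / x) if x != 0 else 0.0

def i2(x):
    # Second derivative of i for x != 0, as in the proof above; i2(0) = 0.
    if x == 0:
        return 0.0
    return (20 * x ** 3 * math.sin(1 / x)
            - 8 * x ** 2 * math.cos(1 / x)
            - x * math.sin(1 / x))

def sample(n):
    # 1/x = pi/2 + n*pi, so sin(1/x) = (-1)**n and cos(1/x) is 0 up to rounding.
    return 1 / (math.pi / 2 + n * math.pi)

# Successive sample points approach 0 while the signs of i and i2 alternate.
for n in range(10, 16):
    x = sample(n)
    print(n, x, i(x), i2(x))
```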


84.12 Inverse Function Theorem

Theorem 18. Let $a, b \in \mathbb{R}$ and $f : [a, b] \to \mathbb{R}$ be a continuous and differentiable function, with Range(f) = [c, d]. If f is either strictly increasing or strictly decreasing on [a, b], then

(a) $f^{-1} : [c, d] \to [a, b]$ exists; and

(b) $f^{-1}$ is differentiable on (c, d) with $(f^{-1})' : (c, d) \to \mathbb{R}$ defined by $y \mapsto 1 / [f'(f^{-1}(y))]$.

Proof. (a) Since f is either strictly increasing or strictly decreasing, it is invertible (or one-to-one) and so $f^{-1}$ exists.

(b) Let $f^{-1}(y)$ be denoted by $x_y$. In what follows, a denotes a point of (c, d) (not the endpoint of the domain).

$$(f^{-1})'(a) = \lim_{y \to a} \frac{f^{-1}(y) - f^{-1}(a)}{y - a} = \lim_{y \to a} \frac{x_y - x_a}{f(x_y) - f(x_a)} \overset{1}{=} \lim_{x_y \to x_a} \frac{1}{\frac{f(x_y) - f(x_a)}{x_y - x_a}} \overset{2}{=} \frac{1}{\lim_{x_y \to x_a} \frac{f(x_y) - f(x_a)}{x_y - x_a}} = \frac{1}{f'(x_a)} = \frac{1}{f'(f^{-1}(a))},$$

where $\overset{1}{=}$ uses the continuity of f and $\overset{2}{=}$ uses a Limit Law and the fact that $\lim_{x_y \to x_a} \dfrac{f(x_y) - f(x_a)}{x_y - x_a}$ exists and is not equal to 0.
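To see the formula $(f^{-1})'(y) = 1 / f'(f^{-1}(y))$ at work, here is a rough numerical sketch. The function f(x) = x³ + x on [0, 2], the bisection-based inverse, and the step size are all assumptions chosen for illustration.

```python
def f(x):
    return x ** 3 + x          # strictly increasing on [0, 2], with range [0, 10]

def df(x):
    return 3 * x ** 2 + 1

def f_inverse(y, lo=0.0, hi=2.0, iterations=100):
    # Bisection: valid because f is continuous and strictly increasing.
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def inverse_derivative_numeric(y, h=1e-6):
    # Central difference applied directly to the inverse function.
    return (f_inverse(y + h) - f_inverse(y - h)) / (2 * h)

def inverse_derivative_formula(y):
    # Theorem 18(b): 1 / f'(f_inverse(y))
    return 1 / df(f_inverse(y))

y0 = 2.0                        # f(1) = 2, so f_inverse(2) = 1 and f'(1) = 4
print(f_inverse(y0), inverse_derivative_numeric(y0), inverse_derivative_formula(y0))
```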


84.13 Parametric Differentiation

In the main text above, we wrote informally that $\dfrac{dy}{dx} = \dfrac{dy}{dt} \div \dfrac{dx}{dt}$. Here is the formal statement and proof of this fact:

Fact 96. Let $f, g : \mathbb{R} \to \mathbb{R}$ be differentiable functions. Let y = f(t) and x = g(t). Then for any a such that g′(a) ≠ 0, we have
$$\left. \frac{dy}{dx} \right|_{t=a} = \left. \frac{dy}{dt} \right|_{t=a} \div \left. \frac{dx}{dt} \right|_{t=a}.$$

Proof.
$$\left. \frac{dy}{dx} \right|_{t=a} = \lim_{t \to a} \frac{f(t) - f(a)}{g(t) - g(a)} = \lim_{t \to a} \left[ \frac{f(t) - f(a)}{g(t) - g(a)} \cdot \frac{t - a}{t - a} \right] = \lim_{t \to a} \left[ \frac{f(t) - f(a)}{t - a} \div \frac{g(t) - g(a)}{t - a} \right]$$

$$= \lim_{t \to a} \frac{f(t) - f(a)}{t - a} \div \lim_{t \to a} \frac{g(t) - g(a)}{t - a} = \left. \frac{dy}{dt} \right|_{t=a} \div \left. \frac{dx}{dt} \right|_{t=a}.$$
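Fact 96 can be illustrated with a simple curve. The sketch below assumes x = g(t) = t² and y = f(t) = t³ (so that y = x^1.5 for t > 0) and uses a central-difference derivative; these choices and the helper names are illustrative only.

```python
def g(t):
    return t ** 2            # x = g(t)

def f(t):
    return t ** 3            # y = f(t)

def derivative(fn, t, h=1e-6):
    # Central difference approximation; h is an arbitrary small step.
    return (fn(t + h) - fn(t - h)) / (2 * h)

def dy_dx(a):
    # Fact 96: dy/dx at t = a equals (dy/dt) / (dx/dt), provided g'(a) != 0.
    dx_dt = derivative(g, a)
    if dx_dt == 0:
        raise ZeroDivisionError("g'(a) = 0, so Fact 96 does not apply")
    return derivative(f, a) / dx_dt

a = 2.0
print(dy_dx(a))                  # via the parametric formula
print(1.5 * g(a) ** 0.5)         # directly: y = x**1.5, so dy/dx = 1.5 * sqrt(x)
```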


84.14 Maclaurin Series

This continues the discussion from p. 463.

Theorem 19. (Lagrange’s Remainder Theorem.) Let $f : [-c, c] \to \mathbb{R}$ be (n + 1)-times differentiable. Fix $a \in (-c, c) \setminus \{0\}$. Let $M_n(a)$ be the corresponding nth-order Maclaurin series. Then there exists b ∈ (−a, a) such that
$$f(a) = M_n(a) + \frac{f^{(n+1)}(b)}{(n + 1)!} a^{n+1}.$$

Proof. Omitted, not because it’s difficult, but because the proof requires the Mean Value Theorem, which in turn requires a few other ingredients. After some thought, I’ve decided to just omit this, rather than add another 10 pages that no one will read.

We refer to the term $\dfrac{f^{(n+1)}(b)}{(n + 1)!} a^{n+1}$ as Lagrange’s remainder. Observe that if Lagrange’s remainder is small, then f(a) is approximately equal to the nth-order Maclaurin series at a.

Moreover, if $\dfrac{f^{(n+1)}(b)}{(n + 1)!} a^{n+1} \to 0$ as n → ∞, then $M_n(a) \to f(a)$, that is, M(a) = f(a) (in words, the Maclaurin series converges). The property that $\dfrac{f^{(n+1)}(b)}{(n + 1)!} a^{n+1} \to 0$ as n → ∞ is precisely the “nice” property we kept referring to in the main text above.

Definition 166. We say that f satisfies the “nice” property at a if for all x ∈ (−a, a), $\lim_{k \to \infty} \dfrac{f^{(k)}(x)}{k!} a^k = 0$.

Corollary 8. If f satisfies the “nice” property at a, then f(a) = M(a), where M(a) is the Maclaurin series at a.

Now we can verify that our five standard series satisfy the “nice” property (for some specified range of values).


Example 626. We verify that the function $g : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto (1 + x)^n$ satisfies the “nice” property at a for all a ∈ (−1, 1): For all x ∈ (−a, a) and for all $k \in \mathbb{Z}^+$, we have
$$\frac{g^{(k)}(x)}{k!} a^k = \frac{n(n - 1)(n - 2) \dots (n - k + 1)(1 + x)^{n-k}}{k!} a^k.$$

Now, $|n(n - 1)(n - 2) \dots (n - k + 1)|$ is bounded above, if not by n!, then by some other expression involving n.

Similarly, $|(1 + x)^{n-k}|$ is bounded above by $2^n$.

Finally, $|a^k|$ is bounded above by 1, because a ∈ (−1, 1).

So indeed, for all x ∈ (−a, a), $\lim_{k \to \infty} \dfrac{g^{(k)}(x)}{k!} a^k = 0$. And so by Corollary 8, for all x ∈ (−1, 1), we have $(1 + x)^n = M(x) = 1 + nx + \dfrac{n(n - 1)}{2!} x^2 + \dots$

Note that in contrast, if a ∉ (−1, 1), then $|a^k|$ is not bounded from above and thus there is no guarantee that $\lim_{k \to \infty} \dfrac{g^{(k)}(x)}{k!} a^k = 0$.

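Example 627. We verify that the function $h : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto e^x$ satisfies the “nice” property at a for all $a \in \mathbb{R}$: For all x ∈ (−a, a) and for all $k \in \mathbb{Z}^+$, we have
$$\frac{h^{(k)}(x)}{k!} a^k = \frac{e^x}{k!} a^k.$$

Since a is fixed, eventually the ending terms in the product $\dfrac{a^k}{k!} = \dfrac{a}{1} \cdot \dfrac{a}{2} \cdot \dfrac{a}{3} \cdot \ldots \cdot \dfrac{a}{k - 1} \cdot \dfrac{a}{k}$ will all be less than 1 in absolute value. And so $\lim_{k \to \infty} \dfrac{a^k}{k!} = 0$. And so as desired, we have
$$\lim_{k \to \infty} \left| \frac{e^x}{k!} a^k \right| \le \lim_{k \to \infty} \frac{e^{|a|}}{k!} |a|^k = e^{|a|} \lim_{k \to \infty} \frac{|a|^k}{k!} = 0.$$

And so indeed, $\mathbb{R}$ is the range of values for which the Maclaurin series converges to h. That is, h(x) is equal to its Maclaurin series for all $x \in \mathbb{R}$.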
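The shrinking of Lagrange's remainder for the exponential function can be watched numerically. The sketch below compares the actual error of the nth-order Maclaurin polynomial at a = 3 with the bound $e^{|a|} |a|^{n+1} / (n + 1)!$; the choice a = 3 and the function names are illustrative assumptions.

```python
import math

def maclaurin_exp(a, n):
    # nth-order Maclaurin polynomial of exp, evaluated at a
    return sum(a ** k / math.factorial(k) for k in range(n + 1))

def remainder_bound(a, n):
    # |f^(n+1)(b)| = e^b <= e^|a| for every b between -|a| and |a|
    return math.exp(abs(a)) * abs(a) ** (n + 1) / math.factorial(n + 1)

a = 3.0
for n in (2, 5, 10, 20):
    error = abs(math.exp(a) - maclaurin_exp(a, n))
    print(n, error, remainder_bound(a, n))
```

Both columns shrink towards 0 as n grows, with the actual error staying below the bound.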


Example 628. We verify that the function $i : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto \sin x$ satisfies the “nice” property at a for all $a \in \mathbb{R}$: For all x ∈ (−a, a) and for all $k \in \mathbb{Z}^+$, we have
$$\frac{i^{(k)}(x)}{k!} a^k = \begin{cases} \dfrac{\cos x}{k!} a^k, & \text{if } k \equiv 1 \pmod 4, \\ \dfrac{-\sin x}{k!} a^k, & \text{if } k \equiv 2 \pmod 4, \\ \dfrac{-\cos x}{k!} a^k, & \text{if } k \equiv 3 \pmod 4, \\ \dfrac{\sin x}{k!} a^k, & \text{if } k \equiv 0 \pmod 4. \end{cases}$$

Again, $\lim_{k \to \infty} \dfrac{a^k}{k!} = 0$. Moreover, ± sin x, ± cos x have maximum absolute value of 1. Thus, $\lim_{k \to \infty} \dfrac{i^{(k)}(x)}{k!} a^k = 0$, as desired.

And so indeed, $\mathbb{R}$ is the range of values for which the Maclaurin series converges to i. That is, i(x) is equal to its Maclaurin series for all $x \in \mathbb{R}$.

I skip the verification for cos x because it is almost exactly identical.


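Example 629. We verify that the function $f : (-1, \infty) \to \mathbb{R}$ defined by $x \mapsto \ln(1 + x)$ satisfies the “nice” property at a for all a ∈ (−1, 1]: For all x ∈ (−a, a) and for all $k \in \mathbb{Z}^+$, we have
$$\frac{f^{(k)}(x)}{k!} a^k = \frac{(-1)^{k-1} (k - 1)!}{k! (1 + x)^k} a^k = \frac{(-1)^{k-1}}{k} \left( \frac{a}{1 + x} \right)^k.$$

Case #1. If a = 0, then $\dfrac{f^{(k)}(x)}{k!} a^k = 0$, so that indeed $\lim_{k \to \infty} \dfrac{f^{(k)}(x)}{k!} a^k = 0$.

Case #2. If a > 0, then
$$\frac{a}{1 + x} \in \left( \frac{a}{1 + a}, \frac{a}{1 - a} \right) = \left( 1 - \frac{1}{1 + a}, \frac{1}{1 - a} - 1 \right) \subseteq (0, 1) \implies \lim_{k \to \infty} \left( \frac{a}{1 + x} \right)^k = 0$$
$$\implies \lim_{k \to \infty} \frac{f^{(k)}(x)}{k!} a^k = \lim_{k \to \infty} \frac{(-1)^{k-1}}{k} \left( \frac{a}{1 + x} \right)^k = \left[ \lim_{k \to \infty} \frac{(-1)^{k-1}}{k} \right] \left[ \lim_{k \to \infty} \left( \frac{a}{1 + x} \right)^k \right] = 0.$$

Case #3. If a < 0, then
$$\frac{a}{1 + x} \in \left( \frac{a}{1 - a}, \frac{a}{1 + a} \right) = \left( \frac{1}{1 - a} - 1, 1 - \frac{1}{1 + a} \right) \subseteq (-1, 0) \implies \lim_{k \to \infty} \left( \frac{a}{1 + x} \right)^k = 0$$
$$\implies \lim_{k \to \infty} \frac{f^{(k)}(x)}{k!} a^k = \lim_{k \to \infty} \frac{(-1)^{k-1}}{k} \left( \frac{a}{1 + x} \right)^k = \left[ \lim_{k \to \infty} \frac{(-1)^{k-1}}{k} \right] \left[ \lim_{k \to \infty} \left( \frac{a}{1 + x} \right)^k \right] = 0.$$

And so indeed, (−1, 1] is the range of values for which the Maclaurin series converges to f. That is, f(x) is equal to its Maclaurin series for all x ∈ (−1, 1].

Note that in contrast, if a > 1, then $\dfrac{a}{1 + x}$ could be greater than 1, so that $\lim_{k \to \infty} \dfrac{f^{(k)}(x)}{k!} a^k = \lim_{k \to \infty} \dfrac{(-1)^{k-1}}{k} \left( \dfrac{a}{1 + x} \right)^k \neq 0$. So there is no guarantee that $\lim_{k \to \infty} \dfrac{f^{(k)}(x)}{k!} a^k = 0$.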
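The contrast between points inside and outside (−1, 1] shows up clearly in partial sums. This is a small sketch; the sample points 0.5, 1 and 2 and the helper name are illustrative choices.

```python
import math

def maclaurin_log1p(a, n):
    # Partial sum a - a^2/2 + a^3/3 - ... up to the a^n term
    return sum((-1) ** (k - 1) * a ** k / k for k in range(1, n + 1))

for a in (0.5, 1.0, 2.0):
    sums = [maclaurin_log1p(a, n) for n in (10, 100, 1000)]
    print(a, math.log(1 + a), sums)
```

At a = 0.5 and a = 1 the partial sums settle near ln(1 + a) (slowly at a = 1), while at a = 2 they grow without bound in absolute value.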


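84.15 Product of Two Power Series

Fact 97. Let $f : F \to \mathbb{R}$ and $g : G \to \mathbb{R}$ be functions and F, G be sets. Suppose $f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \dots$ for all x ∈ F and $g(x) = b_0 + b_1 x + b_2 x^2 + b_3 x^3 + \dots$ for all x ∈ G. Then for all c ∈ F ∩ G,
$$f(c) g(c) = a_0 b_0 + (a_0 b_1 + a_1 b_0) c + (a_0 b_2 + a_1 b_1 + a_2 b_0) c^2 + (a_0 b_3 + a_1 b_2 + a_2 b_1 + a_3 b_0) c^3 + \dots$$

Proof. Let $F_n(x)$ and $G_n(x)$ be the nth-order polynomials for f and g. Let c ∈ F ∩ G. Observe that
$$F_n(c) G_n(c) = a_0 b_0 + (a_0 b_1 + a_1 b_0) c + (a_0 b_2 + a_1 b_1 + a_2 b_0) c^2 + \dots + \left( \sum_{i=0}^{n} a_i b_{n-i} \right) c^n + \dots + a_n b_n c^{2n}.$$

Let $\varepsilon > 0$. Our goal is to show that $\exists N : \forall n \ge N$, $F_n(c) G_n(c) \in (f(c) g(c) - \varepsilon, f(c) g(c) + \varepsilon)$.

Case #1: f(c), g(c) ≥ 0.

Pick $\varepsilon_f, \varepsilon_g > 0$ such that $f(c) \varepsilon_g + g(c) \varepsilon_f + \varepsilon_f \varepsilon_g \overset{1}{<} \varepsilon$, $\varepsilon_f < |f(c)|$, and $\varepsilon_g < |g(c)|$. Note that
$$\overset{1}{<} \iff -f(c) \varepsilon_g - g(c) \varepsilon_f - \varepsilon_f \varepsilon_g > -\varepsilon \implies -f(c) \varepsilon_g - g(c) \varepsilon_f + \varepsilon_f \varepsilon_g \overset{2}{>} -\varepsilon.$$

By definition, $\forall \varepsilon_f > 0$, $\exists N_f > 0$: $\forall n \ge N_f$, we have $F_n(c) \in (f(c) - \varepsilon_f, f(c) + \varepsilon_f)$ and $\forall \varepsilon_g > 0$, $\exists N_g > 0$: $\forall n \ge N_g$, we have $G_n(c) \in (g(c) - \varepsilon_g, g(c) + \varepsilon_g)$. So let $N = \max\{N_f, N_g\}$. Now, $\forall n \ge N$,
$$F_n(c) G_n(c) \in (f(c) g(c) - f(c) \varepsilon_g - g(c) \varepsilon_f + \varepsilon_f \varepsilon_g,\ f(c) g(c) + f(c) \varepsilon_g + g(c) \varepsilon_f + \varepsilon_f \varepsilon_g).$$

But by construction, $-f(c) \varepsilon_g - g(c) \varepsilon_f + \varepsilon_f \varepsilon_g \overset{2}{>} -\varepsilon$ and $f(c) \varepsilon_g + g(c) \varepsilon_f + \varepsilon_f \varepsilon_g \overset{1}{<} \varepsilon$. Hence, the lattermost set is a subset of $(f(c) g(c) - \varepsilon, f(c) g(c) + \varepsilon)$. So we have as desired:
$$F_n(c) G_n(c) \in (f(c) g(c) - \varepsilon, f(c) g(c) + \varepsilon).$$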


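Case #2: f(c), g(c) < 0.

Pick $\varepsilon_f, \varepsilon_g > 0$ such that $-f(c) \varepsilon_g - g(c) \varepsilon_f + \varepsilon_f \varepsilon_g \overset{1}{<} \varepsilon$, $\varepsilon_f < |f(c)|$, and $\varepsilon_g < |g(c)|$. Note that
$$\overset{1}{<} \iff f(c) \varepsilon_g + g(c) \varepsilon_f - \varepsilon_f \varepsilon_g > -\varepsilon \implies f(c) \varepsilon_g + g(c) \varepsilon_f + \varepsilon_f \varepsilon_g \overset{2}{>} -\varepsilon.$$

By definition, $\forall \varepsilon_f > 0$, $\exists N_f > 0$: $\forall n \ge N_f$, we have $F_n(c) \in (f(c) - \varepsilon_f, f(c) + \varepsilon_f)$ and $\forall \varepsilon_g > 0$, $\exists N_g > 0$: $\forall n \ge N_g$, we have $G_n(c) \in (g(c) - \varepsilon_g, g(c) + \varepsilon_g)$. So let $N = \max\{N_f, N_g\}$. Now, $\forall n \ge N$,
$$F_n(c) G_n(c) \in (f(c) g(c) + f(c) \varepsilon_g + g(c) \varepsilon_f + \varepsilon_f \varepsilon_g,\ f(c) g(c) - f(c) \varepsilon_g - g(c) \varepsilon_f + \varepsilon_f \varepsilon_g).$$

But by construction, $f(c) \varepsilon_g + g(c) \varepsilon_f + \varepsilon_f \varepsilon_g \overset{2}{>} -\varepsilon$ and $-f(c) \varepsilon_g - g(c) \varepsilon_f + \varepsilon_f \varepsilon_g \overset{1}{<} \varepsilon$. Hence, the lattermost set is a subset of $(f(c) g(c) - \varepsilon, f(c) g(c) + \varepsilon)$. So we have as desired:
$$F_n(c) G_n(c) \in (f(c) g(c) - \varepsilon, f(c) g(c) + \varepsilon).$$

Case #3: f(c) ≥ 0, g(c) < 0.

Pick $\varepsilon_f, \varepsilon_g > 0$ such that $-f(c) \varepsilon_g + g(c) \varepsilon_f - \varepsilon_f \varepsilon_g \overset{1}{>} -\varepsilon$, $\varepsilon_f < |f(c)|$, and $\varepsilon_g < |g(c)|$. Note that
$$\overset{1}{>} \iff f(c) \varepsilon_g - g(c) \varepsilon_f + \varepsilon_f \varepsilon_g < \varepsilon \implies f(c) \varepsilon_g - g(c) \varepsilon_f - \varepsilon_f \varepsilon_g \overset{2}{<} \varepsilon.$$

By definition, $\forall \varepsilon_f > 0$, $\exists N_f > 0$: $\forall n \ge N_f$, we have $F_n(c) \in (f(c) - \varepsilon_f, f(c) + \varepsilon_f)$ and $\forall \varepsilon_g > 0$, $\exists N_g > 0$: $\forall n \ge N_g$, we have $G_n(c) \in (g(c) - \varepsilon_g, g(c) + \varepsilon_g)$. So let $N = \max\{N_f, N_g\}$. Now, $\forall n \ge N$,
$$F_n(c) G_n(c) \in (f(c) g(c) - f(c) \varepsilon_g + g(c) \varepsilon_f - \varepsilon_f \varepsilon_g,\ f(c) g(c) + f(c) \varepsilon_g - g(c) \varepsilon_f - \varepsilon_f \varepsilon_g).$$

But by construction, $-f(c) \varepsilon_g + g(c) \varepsilon_f - \varepsilon_f \varepsilon_g \overset{1}{>} -\varepsilon$ and $f(c) \varepsilon_g - g(c) \varepsilon_f - \varepsilon_f \varepsilon_g \overset{2}{<} \varepsilon$. Hence, the lattermost set is a subset of $(f(c) g(c) - \varepsilon, f(c) g(c) + \varepsilon)$. So we have as desired:
$$F_n(c) G_n(c) \in (f(c) g(c) - \varepsilon, f(c) g(c) + \varepsilon).$$

Case #4: f(c) < 0, g(c) ≥ 0 is similar to Case #3 and the proof is thus omitted.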
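The coefficient pattern in Fact 97 is the Cauchy product of the two coefficient sequences. As a minimal sketch, the code below multiplies the Maclaurin coefficients of $e^x$ with themselves and compares against $e^{2x}$, whose coefficients are $2^n / n!$; the truncation length N and the helper names are assumptions for illustration.

```python
import math

def cauchy_product(a, b):
    # Coefficient of c^n in the product is sum_{i=0}^{n} a[i] * b[n - i].
    n_terms = min(len(a), len(b))
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(n_terms)]

def evaluate(coeffs, c):
    # Evaluate a truncated power series with the given coefficients at c.
    return sum(coef * c ** k for k, coef in enumerate(coeffs))

N = 20
exp_coeffs = [1 / math.factorial(k) for k in range(N)]      # e^x
product = cauchy_product(exp_coeffs, exp_coeffs)            # should match e^(2x)

c = 0.5
print(evaluate(product, c), math.exp(c) * math.exp(c))
print(product[:4], [2 ** k / math.factorial(k) for k in range(4)])
```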


84.16 Composition of Two Functions

Fact 98. Let $f : F \to \mathbb{R}$ and g be functions. Suppose $f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \dots$ for all x ∈ F. Then for all c such that g(c) ∈ F, we have
$$f(g(c)) = a_0 + a_1 g(c) + a_2 [g(c)]^2 + a_3 [g(c)]^3 + \dots$$

Proof. Let $x_g = g(c)$. Since $x_g \in F$, by assumption, $f(x_g) = a_0 + a_1 x_g + a_2 (x_g)^2 + a_3 (x_g)^3 + \dots$. That is, $f(g(c)) = a_0 + a_1 g(c) + a_2 [g(c)]^2 + a_3 [g(c)]^3 + \dots$, as desired.

Fact 99. Let $f : F \to \mathbb{R}$ and $g : G \to \mathbb{R}$ be functions. Suppose $f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \dots$ for all x ∈ F and $g(x) = b_0 + b_1 x + b_2 x^2 + b_3 x^3 + \dots$ for all x ∈ G. Then ∀c ∈ G ∶ g(c) ∈ F, we have
$$f(g(c)) = a_0 + a_1 (b_0 + b_1 c + b_2 c^2 + b_3 c^3 + \dots) + a_2 (b_0 + b_1 c + b_2 c^2 + b_3 c^3 + \dots)^2 + \dots$$

Proof. Let $T_n(c) = a_0 + a_1 g(c) + a_2 [g(c)]^2 + \dots + a_n [g(c)]^n$ and
$$S_{n,k}(c) = a_0 + a_1 (b_0 + b_1 c + b_2 c^2 + \dots + b_k c^k) + a_2 (b_0 + b_1 c + b_2 c^2 + \dots + b_k c^k)^2 + \dots + a_n (b_0 + b_1 c + b_2 c^2 + \dots + b_k c^k)^n.$$

Clearly, $\lim_{k \to \infty} S_{n,k}(c) = T_n(c)$.

By Fact 98, $\lim_{n \to \infty} T_n(c) = f(g(c))$.

Thus, $\lim_{n \to \infty} \left[ \lim_{k \to \infty} S_{n,k}(c) \right] = f(g(c))$, as desired.
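The double truncation $S_{n,k}$ from the proof can be computed directly. The sketch below assumes $f(x) = e^x$ and $g(x) = \sin x$ as sample series (these choices, the point c = 0.3, and the helper names are illustrative), and compares $S_{n,k}(c)$ with $e^{\sin c}$.

```python
import math

def a_coeff(i):
    return 1 / math.factorial(i)                     # f(x) = e^x

def b_coeff(j):
    # g(x) = sin x: coefficients 0, 1, 0, -1/3!, 0, 1/5!, ...
    if j % 2 == 0:
        return 0.0
    return (-1) ** ((j - 1) // 2) / math.factorial(j)

def g_partial(c, k):
    # Order-k polynomial of g evaluated at c
    return sum(b_coeff(j) * c ** j for j in range(k + 1))

def S(n, k, c):
    # S_{n,k}(c): order-n polynomial of f applied to the order-k polynomial of g
    inner = g_partial(c, k)
    return sum(a_coeff(i) * inner ** i for i in range(n + 1))

c = 0.3
for n, k in ((2, 1), (5, 3), (15, 9)):
    print(n, k, S(n, k, c), math.exp(math.sin(c)))
```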


84.17 The Fundamental Theorems of Calculus

Lemma 7. (Limit inequalities.) Let $f : [a, b] \to \mathbb{R}$ be a continuous function and $c, d \in \mathbb{R}$ be constants.

(a) If f(x) < c for all x ∈ (a, b), then f(a), f(b) ≤ c.

(b) If f(x) > d for all x ∈ (a, b), then f(a), f(b) ≥ d.

The stronger claim that “if f(x) < c for all x ∈ (a, b), then $\lim_{x \nearrow b} f(x) < c$” is false. Consider for example a = 0.5, b = 1, $f(x) = 2 - \dfrac{1}{x}$, and c = 1. Then indeed f(x) < c for all x ∈ (a, b), BUT $\lim_{x \nearrow b} f(x) = c$.

Proof. (a) If f(a) > c, then by the continuity of f, ∃δ > 0 ∶ ∀x ∈ (a, a + δ), f(x) > c, contradicting our assumption that f(x) < c for all x ∈ (a, b). The proof that f(b) ≤ c is similar and thus omitted. The proof of (b) is similar and thus omitted.


Definition 167. Let $f$ be a real function of a real variable that is continuous on $[a, b]$. For $i = 1, 2, \dots, n$, let

$$P_i = \left[a + (i-1)\frac{b-a}{n},\; a + i\,\frac{b-a}{n}\right], \qquad \underline{x}_i = \arg\min_{x \in P_i} f(x), \qquad \overline{x}_i = \arg\max_{x \in P_i} f(x).$$

Then the Lower n-Sum of $f$ on $[a, b]$ and the Upper n-Sum of $f$ on $[a, b]$ are, respectively,

$$\frac{b-a}{n} \sum_{i=1}^{n} f(\underline{x}_i) \qquad \text{and} \qquad \frac{b-a}{n} \sum_{i=1}^{n} f(\overline{x}_i).$$

Definition 168. Let $f$ be a real function of a real variable that is continuous on $[a, b]$. The Lower Integral of $f$ on $[a, b]$ is

$$\underline{\int_a^b} f \, dx = \lim_{n\to\infty} \frac{b-a}{n} \sum_{i=1}^{n} f(\underline{x}_i),$$

provided this limit exists. The Upper Integral of $f$ on $[a, b]$ is

$$\overline{\int_a^b} f \, dx = \lim_{n\to\infty} \frac{b-a}{n} \sum_{i=1}^{n} f(\overline{x}_i),$$

provided this limit exists.

Definition 169. (The Definite or Riemann integral.) Let $f$ be a real function of a real variable that is continuous on $[a, b]$. If the Lower and Upper Integrals of $f$ on $[a, b]$ exist and are equal, i.e. $\underline{\int_a^b} f \, dx = \overline{\int_a^b} f \, dx \in \mathbb{R}$, then the definite (or Riemann) integral is denoted $\int_a^b f \, dx$ and defined to be

$$\int_a^b f \, dx = \underline{\int_a^b} f \, dx = \overline{\int_a^b} f \, dx.$$
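To get a feel for the Lower and Upper n-Sums, here is a rough numerical sketch in Python. It is only an illustration under stated assumptions: the function name `lower_upper_sums` and the `samples` parameter are made up for this example, and the arg min and arg max on each subinterval are approximated by sampling evenly spaced points (which is exact only when f is monotone on each subinterval).

```python
def lower_upper_sums(f, a, b, n, samples=200):
    """Approximate the Lower and Upper n-Sums of f on [a, b].

    Assumption: min/max of f on each subinterval P_i are approximated by
    evaluating f at samples + 1 evenly spaced points (endpoints included).
    """
    width = (b - a) / n
    lower = upper = 0.0
    for i in range(1, n + 1):
        left = a + (i - 1) * width  # left endpoint of P_i
        values = [f(left + width * j / samples) for j in range(samples + 1)]
        lower += min(values)
        upper += max(values)
    return width * lower, width * upper


if __name__ == "__main__":
    # For f(x) = x^2 on [0, 1], both sums should close in on 1/3 as n grows.
    for n in (10, 100, 1000):
        print(n, lower_upper_sums(lambda x: x * x, 0.0, 1.0, n))
```

As n grows, the two sums squeeze together, which is what Definitions 168 and 169 require for the definite integral to exist.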

Lemma 8. Suppose $\int_a^p f(x)\,dx$ and $\int_a^q f(x)\,dx$ are well-defined. Then

$$\int_a^p f(x)\,dx - \int_a^q f(x)\,dx = \int_q^p f(x)\,dx.$$

Proof. Omitted. But here is an informal “proof”: “Clearly”, $\int_a^p f(x)\,dx$ is simply the area of the graph of $f$ between $a$ and $p$, while $\int_a^q f(x)\,dx$ is simply the area of the graph of $f$ between $a$ and $q$. And so the former minus the latter is simply the area of the graph of $f$ between $q$ and $p$, i.e. $\int_q^p f(x)\,dx$.

Lemma 9. Suppose $a, b, c \in \mathbb{R}$ are constants with $a \ne b$. Then

$$c = \int_a^b \frac{c}{b-a}\,dx.$$

Proof. Omitted. But here is an informal “proof”: “Clearly”, $\int_a^b \frac{c}{b-a}\,dx$ is simply the area of a rectangle with base $b - a$ and height $\frac{c}{b-a}$. So $\int_a^b \frac{c}{b-a}\,dx = (b-a) \times \frac{c}{b-a} = c$.

Lemma 10. Suppose that $a < b$ and that for all $x \in [a, b]$, $c < f(x) < d$. Suppose moreover that $\int_a^b f\,dx$ is well-defined. Then

$$(b-a)c < \int_a^b f\,dx < (b-a)d.$$

Proof. Omitted. But here is an informal “proof”: If for all $x \in [a, b]$, $f(x) = c$, then by Lemma 9, $\int_a^b f\,dx = (b-a)c$. And if for all $x \in [a, b]$, $f(x) = d$, then by Lemma 9, $\int_a^b f\,dx = (b-a)d$. “Clearly” then, if for all $x \in [a, b]$, $c < f(x) < d$, then $(b-a)c < \int_a^b f\,dx < (b-a)d$.


Theorem 20. (FTC1.) Let $f$ be a real function of a real variable that is continuous on $[a, b]$ and let $F : [a, b] \to \mathbb{R}$ be defined by the mapping $F(c) = \int_a^c f(x)\,dx$. Then $F$ is an indefinite integral of $f$. That is, $\forall x \in [a, b]$, $F'(x) = f(x)$.

Proof. Let $p, q \in [a, b]$, with $p \ne q$. Then

$$\begin{aligned}
\frac{F(p) - F(q)}{p - q} - f(q) &= \frac{1}{p-q}\left[F(p) - F(q) - (p-q) f(q)\right] \\
&= \frac{1}{p-q}\left[\int_a^p f(x)\,dx - \int_a^q f(x)\,dx - (p-q) f(q)\right] \\
&\overset{1}{=} \frac{1}{p-q}\left[\int_q^p f(x)\,dx - (p-q) f(q)\right] \\
&\overset{2}{=} \frac{1}{p-q}\int_q^p \left[f(x) - f(q)\right] dx,
\end{aligned}$$

where $\overset{1}{=}$ uses Lemma 8 and $\overset{2}{=}$ uses Lemma 9 (note that $f(q)$ is simply a constant).

By the continuity of $f$, for every $\varepsilon > 0$ there exists $\delta > 0$ such that $\forall x \in (q - \delta, q + \delta)$, $f(q) - \varepsilon < f(x) < f(q) + \varepsilon$ and hence $-\varepsilon < f(x) - f(q) < \varepsilon$. So if $p \in (q - \delta, q + \delta)$ with $p > q$ (the case $p < q$ is similar), then by Lemma 10,

$$-\varepsilon (p - q) < \int_q^p \left[f(x) - f(q)\right] dx < \varepsilon (p - q)$$

and thus

$$-\varepsilon < \frac{1}{p-q}\int_q^p \left[f(x) - f(q)\right] dx < \varepsilon.$$

This proves that $\lim_{p \to q} \frac{1}{p-q}\int_q^p \left[f(x) - f(q)\right] dx = 0$. And so

$$\lim_{p \to q}\left[\frac{F(p) - F(q)}{p - q} - f(q)\right] = 0.$$

Since $f(q)$ is a constant, it follows that

$$\lim_{p \to q}\frac{F(p) - F(q)}{p - q} = f(q).$$

By definition of the derivative, $F'(q) = \lim_{p \to q} \frac{F(p) - F(q)}{p - q}$. And so indeed $F'(q) = f(q)$.
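FTC1 can be sanity-checked numerically. The sketch below is an illustration only, not part of the proof: the helper names `integral` and `difference_quotient` are invented here, the integral is approximated by the midpoint rule, and the limit is replaced by a small but fixed step `dp`.

```python
import math


def integral(f, a, c, steps=20_000):
    """Midpoint-rule approximation of the integral of f from a to c."""
    h = (c - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


def difference_quotient(f, a, q, dp=1e-3):
    """[F(p) - F(q)] / (p - q) with p = q + dp, where F(c) = integral of f from a to c."""
    p = q + dp
    return (integral(f, a, p) - integral(f, a, q)) / (p - q)


if __name__ == "__main__":
    # With f = cos and a = 0, F'(1) should be close to cos(1).
    print(difference_quotient(math.cos, 0.0, 1.0), math.cos(1.0))
```

Shrinking `dp` brings the difference quotient closer to f(q), as the theorem predicts, until floating-point error starts to dominate.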

84.18 The Natural Logarithm and Euler’s Number e

It took a long while, but we can now finally define the natural logarithm function (which we’ve been happily using all along)!

Definition 170. The natural logarithm function, denoted ln, has domain $\mathbb{R}^+$, codomain $\mathbb{R}$, and mapping rule $\ln c = \int_1^c \frac{1}{x}\,dx$.

Graphically, $\ln a$ is the area between the curve $y = 1/x$ and the x-axis, bounded by the vertical lines $x = 1$ and $x = a$.

Example 630. The shaded blue area below is ln 6.

[Figure: the curve y = 1/x plotted for x from 0 to 8, with the area between x = 1 and x = 6 shaded blue.]

You probably learnt that $\ln x$ is the number such that $e^{\ln x} = x$. Our definition is a little strange, but has the advantage that we can almost immediately prove that $\frac{d}{dx} \ln x = \frac{1}{x}$.

Fact 100. For all $c \in \mathbb{R}^+$, we have $\frac{d}{dc} \ln c = \frac{1}{c}$.

Proof. Define $f : \mathbb{R}^+ \to \mathbb{R}$ by $f(x) = \frac{1}{x}$. Let $A$ be its corresponding area function. Then

$$\frac{d}{dc} \ln c = \frac{d}{dc}\left(\int_1^c \frac{1}{x}\,dx\right) = \frac{d}{dc}\left[A(c) - A(1)\right] = \underbrace{\frac{dA(c)}{dc}}_{=f(c),\ \text{by FTC1}} - \underbrace{\frac{dA(1)}{dc}}_{=0} = f(c) = \frac{1}{c}.$$
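Definition 170 can be tried out directly on a computer. The following sketch (the function name `ln_by_area` is made up for this illustration) approximates the area under y = 1/x by the midpoint rule and compares it with Python's built-in `math.log`. It assumes c > 0.

```python
import math


def ln_by_area(c, steps=100_000):
    """Midpoint-rule approximation of the integral of 1/x from 1 to c (c > 0)."""
    h = (c - 1.0) / steps  # negative when c < 1, so the signed area is negative
    return h * sum(1.0 / (1.0 + (k + 0.5) * h) for k in range(steps))


if __name__ == "__main__":
    print(ln_by_area(6.0), math.log(6.0))         # the area in Example 630
    print(ln_by_area(2.0) + ln_by_area(3.0))      # should also be close to ln 6
```

The second printed line is a numerical hint that ln(xy) = ln x + ln y, one of the usual properties of the logarithm.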

The usual properties of the natural logarithm may now be proven.

Fact 101. (a) $\ln 1 = 0$. (b) $\ln(xy) = \ln x + \ln y$. (c) $\ln(x/y) = \ln x - \ln y$. (d) $\ln x^n = n \ln x$.

Proof. (a) $\ln 1 = \int_1^1 \frac{1}{x}\,dx = 0$.

(b) Differentiate both sides with respect to $x$ to get

$$\frac{d}{dx} \ln(xy) = \frac{1}{xy}\left(y + x\frac{dy}{dx}\right) = \frac{1}{x} + \frac{1}{y}\frac{dy}{dx} \qquad \text{and} \qquad \frac{d}{dx}(\ln x + \ln y) = \frac{1}{x} + \frac{1}{y}\frac{dy}{dx}.$$

Thus, $\ln(xy)$ and $\ln x + \ln y$ are both indefinite integrals for the same function $\frac{1}{x} + \frac{1}{y}\frac{dy}{dx}$. By Fact 60 then, $\ln(xy) = \ln x + \ln y + C$. But for $x = 1$, $\ln y = \ln 1 + \ln y + C = \ln y + C$, so $C = 0$. Thus, $\ln(xy) = \ln x + \ln y$.

(c) Similar to the above.

(d) Differentiate both sides with respect to $x$ to get $\frac{d}{dx} \ln x^n = \frac{1}{x^n} n x^{n-1} = \frac{n}{x}$ and $\frac{d}{dx}(n \ln x) = \frac{n}{x}$. Thus, $\ln x^n$ and $n \ln x$ are both indefinite integrals for the same function $\frac{n}{x}$. By Fact 60 then, $\ln x^n = n \ln x + C$. But for $x = 1$, $\ln 1^n = \ln 1 = 0$ and $n \ln 1 + C = C$, so $C = 0$. Thus, $\ln x^n = n \ln x$.

84.19 Euler’s Number

The number $e \approx 2.7182818284\ldots$ is sometimes called Euler’s number in honour of Euler.⁸⁹ It is not to be confused with Euler’s constant $\gamma \approx 0.5772156649\ldots$, a number you’ve probably never encountered.

Definition 171. $e$ is the unique number such that $\ln e = 1$.

[Figure: plot on x- and y-axes, with x ranging from 0 to 8.]

How do we know that $e$ is indeed uniquely defined? It’s because $\frac{1}{x}$ is strictly positive for all $x \in [1, \infty)$, so that $\ln x$ is strictly increasing. So there can be only one number $e$ such that $\ln e = 1$.

Fact 102. $e^{\ln x} = x$ for all $x \in \mathbb{R}^+$.

Proof. Differentiate $e^{\ln x}$ twice:

$$\frac{d}{dx} e^{\ln x} = \frac{e^{\ln x}}{x}, \qquad \frac{d^2}{dx^2} e^{\ln x} = \frac{x e^{\ln x}/x - e^{\ln x}}{x^2} = 0.$$

The only functions whose derivative is 0 are constant functions.⁹⁰ Hence, the first derivative of $e^{\ln x}$ is a constant. That is, $\frac{e^{\ln x}}{x} = C$ or $e^{\ln x} = Cx$. But we also know that for $x = 1$, $e^{\ln 1} = e^0 = 1$, so that $C = 1$. Hence, $e^{\ln x} = x$, as desired.

⁸⁹ Note though that it was simply Euler himself who happened to start using the letter e to denote this number. And presumably he was not doing it to honour himself. Calling it Euler’s number is simply an honour conferred by posterity.

⁹⁰ As noted in n. 53, this textbook shall simply take this assertion for granted.


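The next two theorems give the two alternative definitions of $e$.

Theorem 21. $e = \sum_{i=0}^{\infty} \frac{1}{i!}$. (Equivalently, $e = \frac{1}{0!} + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \dots$)

Proof. From our study of Maclaurin series, we know that $e^x = \sum_{i=0}^{\infty} \frac{x^i}{i!}$ for all $x \in \mathbb{R}$. Hence, $e = e^1 = \sum_{i=0}^{\infty} \frac{1}{i!}$, as desired.

Theorem 22. $\lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n = e$.

Proof. Let $n \in (1, \infty)$ and $x \in \left[1, 1 + \frac{1}{n}\right]$. Then $\frac{1}{x} \le 1$, so that

$$\ln\left(1 + \frac{1}{n}\right) = \int_1^{1+\frac{1}{n}} \frac{1}{x}\,dx \le \int_1^{1+\frac{1}{n}} 1\,dx = \frac{1}{n}.$$

Similarly, $\frac{1}{x} \ge \frac{n}{n+1}$, so that

$$\ln\left(1 + \frac{1}{n}\right) = \int_1^{1+\frac{1}{n}} \frac{1}{x}\,dx \ge \int_1^{1+\frac{1}{n}} \frac{n}{n+1}\,dx = \frac{1}{n+1}.$$

Altogether,

$$\begin{aligned}
& \frac{1}{n+1} \le \ln\left(1 + \frac{1}{n}\right) \le \frac{1}{n} \\
\implies\ & e^{\frac{1}{n+1}} \le e^{\ln\left(1+\frac{1}{n}\right)} \le e^{1/n} \\
\implies\ & e^{\frac{1}{n+1}} \le 1 + \frac{1}{n} \le e^{1/n} \\
\implies\ & e^{\frac{n}{n+1}} \le \left(1 + \frac{1}{n}\right)^n \le e^{n/n} = e.
\end{aligned}$$

Taking limits,

$$\lim_{n\to\infty} e^{\frac{n}{n+1}} \le \lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n \le \lim_{n\to\infty} e.$$

Since $\lim_{n\to\infty} e^{\frac{n}{n+1}} = e$ and $\lim_{n\to\infty} e = e$, by the Squeeze Theorem (Theorem 17), we must have $\lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n = e$.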
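Both characterisations of e are easy to explore numerically. The sketch below is only an illustration (the function names `e_series` and `e_limit` are invented for this example); it also prints the two squeeze bounds used in the proof of Theorem 22.

```python
import math


def e_series(terms):
    """Partial sum 1/0! + 1/1! + ... + 1/(terms - 1)!  (Theorem 21)."""
    total, term = 0.0, 1.0
    for i in range(terms):
        total += term
        term /= (i + 1)  # turns 1/i! into 1/(i + 1)!
    return total


def e_limit(n):
    """(1 + 1/n)^n  (Theorem 22)."""
    return (1.0 + 1.0 / n) ** n


if __name__ == "__main__":
    print(e_series(20), math.e)
    for n in (1, 10, 1000, 10**6):
        # lower squeeze bound, the sequence itself, upper squeeze bound
        print(n, math.exp(n / (n + 1)), e_limit(n), math.e)
```

Notice how much faster the series converges than the limit: 20 terms of the series already agree with e to many decimal places, while the limit needs a very large n.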

85 Appendices for Part VI: Probability and Statistics

85.1 How to Count

Theorem 23. (AP.) If $A$ and $B$ are disjoint, finite sets, then $|A \cup B| = |A| + |B|$.

Proof. Let $A = \{a_1, a_2, \dots, a_p\}$ and $B = \{b_1, b_2, \dots, b_q\}$. Then

$$A \cup B = \{a_1, a_2, \dots, a_p, b_1, b_2, \dots, b_q\}.$$

We have $|A| = p$, $|B| = q$, and $|A \cup B| = p + q$. The result follows.

Corollary 9. If $A_1, A_2, \dots, A_n$ are disjoint, finite sets, then

$$\left|\bigcup_{i=1}^{n} A_i\right| = \sum_{i=1}^{n} |A_i|.$$

Proof. By induction (details omitted).

Theorem 24. (MP.) If $A$ and $B$ are finite sets, then $|A \times B| = |A| \times |B|$.

Proof. Let $A = \{a_1, a_2, \dots, a_p\}$ and $B = \{b_1, b_2, \dots, b_q\}$. Then

$$A \times B = \left\{ (a_1, b_1), (a_1, b_2), \dots, (a_1, b_q),\ (a_2, b_1), (a_2, b_2), \dots, (a_2, b_q),\ \dots,\ (a_p, b_1), (a_p, b_2), \dots, (a_p, b_q) \right\}.$$

We have $|A| = p$, $|B| = q$, and $|A \times B| = pq$. The result follows.

Corollary 10. If $A_1, A_2, \dots, A_n$ are finite sets, then $\left|\times_{i=1}^{n} A_i\right| = \prod_{i=1}^{n} |A_i|$.

Proof. By induction (details omitted).


Theorem 25. (IEP.) If $A$ and $B$ are finite sets, then $|A \cup B| = |A| + |B| - |A \cap B|$.

Proof. $A \cup B = (A \setminus (A \cap B)) \cup B$. So by the AP, $|A \cup B| \overset{1}{=} |A \setminus (A \cap B)| + |B|$.

Now, $(A \setminus (A \cap B)) \cup (A \cap B) = A$. So also by the AP, $|A \setminus (A \cap B)| + |A \cap B| = |A|$ or $|A \setminus (A \cap B)| \overset{2}{=} |A| - |A \cap B|$. Plug $\overset{2}{=}$ into $\overset{1}{=}$ to get the desired result.

Corollary 11. If $A_1, A_2, A_3$ are finite sets, then

$$|A_1 \cup A_2 \cup A_3| = |A_1| + |A_2| + |A_3| - |A_1 \cap A_2| - |A_1 \cap A_3| - |A_2 \cap A_3| + |A_1 \cap A_2 \cap A_3|.$$

Proof. Similar to the previous proof, just more tedious.

And here’s the generalisation of the IEP:

Corollary 12. If $A_1, A_2, \dots, A_n$ are finite sets, then

$$\left|\bigcup_{i=1}^{n} A_i\right| = \sum_{i=1}^{n} |A_i| - \sum_{i<j} |A_i \cap A_j| + \sum_{i<j<k} |A_i \cap A_j \cap A_k| - \dots + (-1)^{n-1} \left|\bigcap_{i=1}^{n} A_i\right|,$$

where each sum runs over distinct indices, with every unordered pair or triple counted once.

Proof. By induction (details omitted).

Theorem 26. (CP.) If $A$ and $B$ are finite sets and $B \subseteq A$, then $|A \setminus B| = |A| - |B|$.

Proof. $B$ and $A \setminus B$ are disjoint, finite sets. Moreover, since $B \subseteq A$, $B \cup (A \setminus B) = A$. So by the AP, $|B| + |A \setminus B| = |A|$. Rearranging yields the desired result.

Corollary 13. If $A$ is a finite set and $B_1, B_2, \dots, B_n \subseteq A$ are disjoint, then

$$\left|A \setminus \bigcup_{i=1}^{n} B_i\right| = |A| - \sum_{i=1}^{n} |B_i|.$$

Proof. By the corollary to the AP, $\left|\bigcup_{i=1}^{n} B_i\right| = \sum_{i=1}^{n} |B_i|$. The result then follows by the CP.
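The four counting principles can be checked on small concrete sets. This is merely a sanity check on examples, not a proof; the function name `check_counting_principles` and the sample sets are invented for illustration.

```python
from itertools import product


def check_counting_principles(A, B):
    """Return one boolean per principle for the finite sets A and B."""
    a_only = A - B          # the set difference A \ B, which is disjoint from B
    common = A & B          # A ∩ B, which is a subset of A
    return {
        "AP": len(a_only | B) == len(a_only) + len(B),
        "MP": len(set(product(A, B))) == len(A) * len(B),
        "IEP": len(A | B) == len(A) + len(B) - len(common),
        "CP": len(A - common) == len(A) - len(common),
    }


if __name__ == "__main__":
    print(check_counting_principles({1, 2, 3, 4}, {3, 4, 5}))
```

Each entry should come out `True` for any pair of finite sets you try.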

85.2 Circular Permutations

Consider $n$ objects, only $k$ of which are distinct. Let $r_1, r_2, \dots,$ and $r_k$ be the numbers of times the 1st, 2nd, …, and $k$th distinct objects appear. We already know from Fact 63 that the number of (linear) permutations of these $n$ objects is

$$\frac{n!}{r_1!\, r_2! \cdots r_k!}.$$

We also know that $m$ distinct objects have $m!$ (linear) permutations and $(m-1)!$ circular permutations.

A reasonable conjecture might thus be that the number of circular permutations of the above $n$ objects is

$$\frac{(n-1)!}{r_1!\, r_2! \cdots r_k!}.$$

The above conjecture sometimes “works”. For example, SEE has $3!/2! = 3$ (linear) permutations and SEE indeed also has $(3-1)!/2! = 1$ circular permutation. However and unfortunately, this conjecture is, in general, incorrect. Here are two counter-examples.

Example 631. There are $3!/3! = 1$ (linear) permutations of the three letters AAA.

If the above conjecture were true, then there ought to be $(3-1)!/3! = 2!/3! = 1/3$ circular permutations of AAA. But this is not even an integer, so obviously it cannot be the number of circular permutations of AAA. In fact, there is also exactly 1 circular permutation of AAA.

Example 632. There are $6!/(3!\,3!) = 20$ (linear) permutations of the six letters AAABBB.

If the above conjecture were true, then there ought to be $(6-1)!/(3!\,3!) = 10/3$ circular permutations of AAABBB. But this is not even an integer, so obviously it cannot be the number of circular permutations of AAABBB. In fact, there are exactly 4 circular permutations of AAABBB.

A general solution (i.e. formula) is possible but is a bit too advanced for A-levels.⁹¹

⁹¹ See e.g. this Handbook on Combinatorics.
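While the general formula is beyond our scope, small cases can simply be counted by brute force. The sketch below is an illustration (the function name `circular_permutations` is made up here). It treats two arrangements as the same circular permutation if one is a rotation of the other; reflections are counted as different, in line with the (m − 1)! count for distinct objects.

```python
from itertools import permutations


def circular_permutations(letters):
    """Count arrangements of `letters` around a circle, up to rotation."""
    seen = set()
    for perm in set(permutations(letters)):  # set() removes repeated linear permutations
        n = len(perm)
        # Canonical representative: the lexicographically smallest rotation.
        canon = min(perm[i:] + perm[:i] for i in range(n))
        seen.add(canon)
    return len(seen)


if __name__ == "__main__":
    for word in ("SEE", "AAA", "AAABBB", "ABCD"):
        print(word, circular_permutations(word))
```

Brute force is only practical for short words, since the number of linear permutations grows factorially.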

85.3 Probability

Proposition 12 (reproduced from p. 549 above). Let $S$ be the sample space, $\Sigma$ be the corresponding event space, and $A, B$ be events. If the probability function $P : \Sigma \to \mathbb{R}$ satisfies the Kolmogorov Axioms, then $P$ also satisfies the following properties:

1. Complements. $P(A) = 1 - P(A^c)$.
2. Probability of Empty Event is Zero. $P(\emptyset) = 0$.
3. Monotonicity. If $B \subseteq A$, then $P(B) \le P(A)$.
4. Probabilities Are At Most One. $P(A) \le 1$.
5. Inclusion-Exclusion. $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.

Proof. (Continued from p. 549.) But also by definition, $A \cup A^c = S$. Hence, $P(A \cup A^c) = P(S)$. By the Normalisation Axiom, $P(S) = 1$.

2. Probability of Empty Event is Zero. $\emptyset \cap A = \emptyset$ and $\emptyset \cup A = A$. And so again by the Additivity Axiom, $P(\emptyset \cup A) = P(A) = P(\emptyset) + P(A)$. Thus, $P(\emptyset) = 0$.

3. Monotonicity. Suppose $B \subseteq A$. Then $B \cap (A \setminus B) = \emptyset$ and $B \cup (A \setminus B) = A$. Thus, by the Additivity Axiom, $P(A) = P(B) + P(A \setminus B)$. By the Non-Negativity Axiom, $P(A \setminus B) \ge 0$. Hence, $P(B) \le P(A)$.

4. Probabilities Are At Most One. Any event $A$ is a subset of $S$. And so by Monotonicity, $P(A) \le P(S)$. But by the Normalisation Axiom, $P(S) = 1$. Thus, $P(A) \le 1$.

5. Inclusion-Exclusion Principle. By the Additivity Axiom, $P(A \cup B) = P(A) + P(B \setminus A)$.

Also by the Additivity Axiom, $P(A \cap B) + P(B \setminus A) = P(B)$. Altogether then, $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
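As a sanity check on one concrete example (not a proof), the five properties can be verified exhaustively for a fair six-sided die, where every subset of the sample space is an event. The names `S`, `P`, `all_events` and `properties_hold` below are illustrative choices, and the uniform probability function is an assumption of this example.

```python
from fractions import Fraction
from itertools import combinations

S = frozenset(range(1, 7))  # sample space of a fair die (illustrative)


def P(event):
    """Uniform probability on S: P(A) = |A| / |S|."""
    return Fraction(len(event), len(S))


def all_events():
    """Every subset of S."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def properties_hold():
    """Check properties 1 to 5 of Proposition 12 for every event (pair)."""
    events = all_events()
    for A in events:
        if P(A) != 1 - P(S - A) or P(A) > 1:          # properties 1 and 4
            return False
        for B in events:
            if B <= A and P(B) > P(A):                # property 3
                return False
            if P(A | B) != P(A) + P(B) - P(A & B):    # property 5
                return False
    return P(frozenset()) == 0                        # property 2


if __name__ == "__main__":
    print(properties_hold())
```

Using `Fraction` keeps every probability exact, so the equalities are tested without rounding error.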

85.4 Random Variables

Proposition 13 (reproduced from p. 588). The expectation operator $E$ is linear. That is, if $X$ and $Y$ are random variables and $c$ is a constant, then

(a) Additivity: $E[X + Y] = E[X] + E[Y]$,

(b) Homogeneity of degree 1: $E[cX] = cE[X]$.

Proof. This proposition applies even for non-discrete random variables. But we’ll prove this proposition only for the case where the random variables are discrete. We prove (b) first.

(b) $$E[cX] = \sum_{k \in \mathrm{Range}(X)} P(X = k) \cdot (ck) = c \sum_{k \in \mathrm{Range}(X)} P(X = k) \cdot k = cE[X].$$

(a) $$\begin{aligned}
E[X + Y] &= \sum_{k \in \mathrm{Range}(X)} \sum_{l \in \mathrm{Range}(Y)} P(X = k, Y = l) \cdot (k + l) \\
&= \sum_{k \in \mathrm{Range}(X)} k \sum_{l \in \mathrm{Range}(Y)} P(X = k, Y = l) + \sum_{l \in \mathrm{Range}(Y)} l \sum_{k \in \mathrm{Range}(X)} P(X = k, Y = l) \\
&= \sum_{k \in \mathrm{Range}(X)} k P(X = k) + \sum_{l \in \mathrm{Range}(Y)} l P(Y = l) \\
&= E[X] + E[Y].
\end{aligned}$$


Proposition 14 (reproduced from p. 597). Let $X$ and $Y$ be independent random variables. Let $c$ be a constant. Then

(a) Additivity: $V[X + Y] = V[X] + V[Y]$,

(b) Homogeneity of degree 2: $V[cX] = c^2 V[X]$.

Proof. We use Fact 72 and the linearity of the expectation operator.

(b) $V[cX] = E\left[(cX)^2\right] - (c\mu_X)^2 = c^2 E\left[X^2\right] - c^2 \mu_X^2 = c^2\left(E\left[X^2\right] - \mu_X^2\right) = c^2 V[X]$.

To prove (a), we’ll also use Lemma 11:

$$\begin{aligned}
V[X + Y] &= E\left[(X + Y)^2\right] - \left(E[X + Y]\right)^2 \\
&= E\left[X^2 + Y^2 + 2XY\right] - \left(E[X] + E[Y]\right)^2 \\
&= E\left[X^2\right] + E\left[Y^2\right] + 2E[XY] - \left(\mu_X^2 + \mu_Y^2 + 2\mu_X \mu_Y\right) \\
&= \underbrace{E\left[X^2\right] - \mu_X^2}_{V[X]} + \underbrace{E\left[Y^2\right] - \mu_Y^2}_{V[Y]} + 2\underbrace{\left(E[XY] - \mu_X \mu_Y\right)}_{0}.
\end{aligned}$$

Lemma 11. If $X$ and $Y$ are independent random variables, then $E[XY] = E[X]E[Y]$.

Proof. We prove this Lemma only for the case where $X$ and $Y$ are both discrete.

$$\begin{aligned}
E[XY] &= \sum_k \sum_l P(X = k, Y = l) \cdot kl \\
&= \sum_k \sum_l P(X = k) P(Y = l) \cdot kl \qquad \text{(independence)} \\
&= \sum_k \left(P(X = k)\, k \sum_l P(Y = l) \cdot l\right) = \sum_k \left(P(X = k)\, k\, E[Y]\right) \\
&= E[Y] \sum_k P(X = k)\, k = E[Y] E[X].
\end{aligned}$$
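Here is a small exact check of linearity of expectation, Lemma 11, and Proposition 14 for one pair of independent discrete random variables. The two distributions `X` and `Y` below are arbitrary choices made for this illustration, and independence is built in by multiplying the marginal probabilities.

```python
from fractions import Fraction
from itertools import product

# P(X = k) and P(Y = l): arbitrary illustrative distributions.
X = {1: Fraction(1, 2), 2: Fraction(1, 2)}
Y = {0: Fraction(1, 3), 1: Fraction(1, 3), 5: Fraction(1, 3)}


def E(g):
    """E[g(X, Y)] for independent X and Y, so P(X=k, Y=l) = P(X=k) P(Y=l)."""
    return sum(px * py * g(k, l) for (k, px), (l, py) in product(X.items(), Y.items()))


def V(g):
    """V[g(X, Y)] = E[g^2] - (E[g])^2."""
    return E(lambda k, l: g(k, l) ** 2) - E(g) ** 2


if __name__ == "__main__":
    print(E(lambda k, l: k + l), E(lambda k, l: k) + E(lambda k, l: l))
    print(E(lambda k, l: k * l), E(lambda k, l: k) * E(lambda k, l: l))
    print(V(lambda k, l: k + l), V(lambda k, l: k) + V(lambda k, l: l))
```

Each printed pair should match exactly, since all arithmetic is done with fractions.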


Fact 103. (a) Let $X$ be the number of fair coin-flips until we get two consecutive heads. Let $Y$ be the number of fair coin-flips until we get HT consecutively. Then $E[X] = \mu_X = 6$ and $E[Y] = \mu_Y = 4$.

(b) Flip a fair coin $n + 1$ times. This gives us $n$ pairs of consecutive coin-flips. Let $A$ be the proportion of these $n$ pairs of consecutive coin-flips that are HH. Let $B$ be the proportion that are HT. Then $E[A] = \mu_A = 1/4$ and $E[B] = \mu_B = 1/4$.

Proof. (a) To find $\mu_X$ actually requires a clever, new trick. Let

$$p = E[\text{Additional number of flips to get HH} \mid \text{Last flip was T}],$$
$$q = E[\text{Additional number of flips to get HH} \mid \text{Last two flips were TH}].$$

Observe that $p$ is the expected number of flips, if we’re “restarting”. Thus, $p = \mu_X$. Now,

$$q = P(\text{Next flip is H}) \times 1 + P(\text{Next flip is T}) \times (1 + p) = 0.5 \times 1 + 0.5 \times (1 + p) = 1 + 0.5p.$$

(Explanation: If the next flip is H, then we’ve completed HH and this took us only 1 more flip. If instead the next flip is T, then we start all over again; we’ve already taken 1 flip and are expected to take another $p$ flips.)

Similarly, observe that

$$p = P(\text{Next flip is H}) \times (1 + q) + P(\text{Next flip is T}) \times (1 + p) = 0.5 \times (2 + 0.5p) + 0.5 \times (1 + p) = 1.5 + 0.75p.$$

(Explanation: If the next flip is H, then we expect to take, in addition, another $q$ flips. If instead the next flip is T, then we start all over again; we’ve already taken 1 flip and are expected to take another $p$ flips.)

Hence, $p = 6 = \mu_X$. The reasoning used above is illustrated by the probability tree below.

[Figure: probability tree.]

Let’s now find $\mu_Y$. Again, let

$$r = E[\text{Additional number of flips to get HT} \mid \text{Last two flips were TT}],$$
$$s = E[\text{Additional number of flips to get HT} \mid \text{Last flip was H}].$$

Observe that $r$ is the expected number of flips, if we’re “restarting”. Thus, $r = \mu_Y$.

Now, also observe that

$$s = P(\text{Next flip is T}) \times 1 + P(\text{Next flip is H}) \times (1 + s) = 0.5 \times 1 + 0.5 \times (1 + s) = 1 + 0.5s.$$

(Explanation: If the next flip is T, then we’ve completed HT and this took us only 1 more flip. If instead the next flip is H, then we’ve already taken 1 flip and are expected to take another $s$ flips.) So $s = 2$.

Similarly, observe that

$$r = P(\text{Next flip is H}) \times (1 + s) + P(\text{Next flip is T}) \times (1 + r) = 0.5 \times (1 + 2) + 0.5 \times (1 + r) = 2 + 0.5r.$$

(Explanation: If the next flip is H, then we’ve already taken 1 flip and are expected to take another $s$ flips. If the next flip is T, then we’ve already taken 1 flip and are expected to take another $r$ flips.) So $r = 4 = \mu_Y$.

(b) Let $S_i$ be the random variable that indicates whether the $i$th pair of consecutive coin-flips is HH. That is, $S_i = 1$ if so and $S_i = 0$ if not. Then

$$A = \frac{S_1 + S_2 + \dots + S_n}{n}.$$

And so,

$$E[A] = E\left[\frac{S_1 + S_2 + \dots + S_n}{n}\right] = \frac{1}{n} \sum_{i=1}^{n} E[S_i].$$

But $E[S_i] = 1/4$. Thus, $E[A] = 1/4$.

The proof that $E[B] = 1/4$ is similar.
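If the result in (a) feels surprising, a simulation may help. The sketch below is only a Monte Carlo illustration with an arbitrarily chosen seed and number of trials (the function names are made up here); it estimates the average waiting time for a pattern and will not reproduce 6 and 4 exactly.

```python
import random


def flips_until(pattern, rng):
    """Number of fair coin-flips until `pattern` (e.g. "HH") first appears."""
    history = ""
    while not history.endswith(pattern):
        history += rng.choice("HT")
    return len(history)


def average_wait(pattern, trials=100_000, seed=0):
    """Monte Carlo estimate of the expected number of flips until `pattern`."""
    rng = random.Random(seed)
    return sum(flips_until(pattern, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print("HH:", average_wait("HH"))  # should be near 6
    print("HT:", average_wait("HT"))  # should be near 4
```

The simulated averages hover around 6 for HH and 4 for HT, even though HH and HT are equally likely as any given pair of consecutive flips, which is the contrast between parts (a) and (b).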

85.5 The Poisson Distribution

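Fact 77 (reproduced from p. 615). Let $X \sim \mathrm{Po}(\lambda)$. Then $E[X] = \lambda$ and $V[X] = \lambda$.

Proof.

$$\begin{aligned}
E[X] &= \sum_{k=0}^{\infty} P(X = k) \cdot k = \sum_{k=1}^{\infty} P(X = k) \cdot k && (\because P(X = 0) \cdot 0 = 0) \\
&= \sum_{k=1}^{\infty} \frac{\lambda^k e^{-\lambda}}{k!} \cdot k = e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^k}{(k-1)!} && \text{(Pull out constant)} \\
&= \lambda e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!} && \text{(Pull out constant)} \\
&= \lambda e^{-\lambda} \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} && \text{(Change starting value of summation)} \\
&= \lambda e^{-\lambda} e^{\lambda} = \lambda. && \text{(Maclaurin series for } e^x\text{)}
\end{aligned}$$

Similarly compute

$$\begin{aligned}
E\left[X^2\right] &= \sum_{k=0}^{\infty} P(X = k) \cdot k^2 = \sum_{k=1}^{\infty} P(X = k) \cdot k^2 && (\because P(X = 0) \cdot 0^2 = 0) \\
&= \sum_{k=1}^{\infty} \frac{\lambda^k e^{-\lambda}}{k!} \cdot k^2 = e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^k k}{(k-1)!} && \text{(Pull out constant)} \\
&= e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^k}{(k-1)!}\left[(k-1) + 1\right] = e^{-\lambda} \left\{ \sum_{k=1}^{\infty} \left[\frac{\lambda^k}{(k-1)!}(k-1)\right] + \sum_{k=1}^{\infty} \frac{\lambda^k}{(k-1)!} \right\} \\
&= e^{-\lambda} \left\{ \sum_{k=2}^{\infty} \left[\frac{\lambda^k}{(k-1)!}(k-1)\right] + \sum_{k=1}^{\infty} \frac{\lambda^k}{(k-1)!} \right\} && \left(\because \frac{\lambda^1}{(1-1)!}(1-1) = 0\right) \\
&= e^{-\lambda} \left[ \lambda^2 \sum_{k=2}^{\infty} \frac{\lambda^{k-2}}{(k-2)!} + \lambda \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!} \right] \\
&= e^{-\lambda} \left( \lambda^2 \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} + \lambda \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} \right) && \text{(Change starting value of summation)} \\
&= e^{-\lambda} \left( \lambda^2 e^{\lambda} + \lambda e^{\lambda} \right) = \lambda^2 + \lambda. && \text{(Maclaurin series for } e^x\text{)}
\end{aligned}$$

Hence, $V[X] = E\left[X^2\right] - \left(E[X]\right)^2 = \lambda^2 + \lambda - \lambda^2 = \lambda$.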
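The two infinite sums above can be approximated on a computer by cutting them off after enough terms. In this sketch, the function names and the `cutoff` parameter are illustrative choices, and the tail beyond the cutoff is assumed to be negligible (which is reasonable for small λ).

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def poisson_moments(lam, cutoff=100):
    """Truncated sums for E[X] and V[X] = E[X^2] - (E[X])^2."""
    mean = sum(poisson_pmf(k, lam) * k for k in range(cutoff))
    second = sum(poisson_pmf(k, lam) * k * k for k in range(cutoff))
    return mean, second - mean ** 2


if __name__ == "__main__":
    print(poisson_moments(2.5))  # both numbers should be close to 2.5
```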


Theorem 13 (reproduced from p. 611). (The limit of the binomial random variable is the Poisson random variable.) Fix $\lambda$. Let $X_n \sim B\left(n, \frac{\lambda}{n}\right)$. Let $Y = \lim_{n\to\infty} X_n$. Then $Y$ is a random variable that satisfies the following two properties:

• $\mathrm{Range}(Y) = \{0, 1, 2, 3, \dots\} = \mathbb{Z}_0^+$.

• The probability distribution of $Y$ is given by $P(Y = k) = \frac{\lambda^k e^{-\lambda}}{k!}$, for all $k \in \mathbb{Z}_0^+$.

Moreover, we call $Y$ a Poisson random variable with parameter $\lambda$.

Proof. Since the range of $X_n$ is $\{0, 1, 2, \dots, n\}$, it follows that the range of $Y = \lim_{n\to\infty} X_n$ is $\mathbb{Z}_0^+$.

$X_n$ has probability distribution

$$P(X_n = k) = \binom{n}{k} \left(\frac{\lambda}{n}\right)^k \left(1 - \frac{\lambda}{n}\right)^{n-k}.$$

Now,

$$\begin{aligned}
\lim_{n\to\infty} P(X_n = k) &= \lim_{n\to\infty} \binom{n}{k} \left(\frac{\lambda}{n}\right)^k \left(1 - \frac{\lambda}{n}\right)^{n-k} = \lim_{n\to\infty} \frac{n!}{k!(n-k)!} \left(\frac{\lambda}{n}\right)^k \left(1 - \frac{\lambda}{n}\right)^{n-k} \\
&= \frac{\lambda^k}{k!} \lim_{n\to\infty} \frac{n!}{(n-k)!} \left(\frac{1}{n}\right)^k \left(1 - \frac{\lambda}{n}\right)^{n-k} \qquad \text{(Move out terms not involving } n\text{)} \\
&= \frac{\lambda^k}{k!} \lim_{n\to\infty} \frac{n!}{n^k (n-k)!} \left(1 - \frac{\lambda}{n}\right)^{n} \left(1 - \frac{\lambda}{n}\right)^{-k} \\
&= \frac{\lambda^k}{k!} \lim_{n\to\infty} \frac{n(n-1)\cdots(n-k+1)}{n^k} \left(1 - \frac{\lambda}{n}\right)^{n} \left(1 - \frac{\lambda}{n}\right)^{-k} \\
&= \frac{\lambda^k}{k!} \underbrace{\left[\lim_{n\to\infty} \frac{n(n-1)\cdots(n-k+1)}{n^k}\right]}_{=1} \underbrace{\left[\lim_{n\to\infty} \left(1 - \frac{\lambda}{n}\right)^{n}\right]}_{=e^{-\lambda}} \underbrace{\left[\lim_{n\to\infty} \left(1 - \frac{\lambda}{n}\right)^{-k}\right]}_{=1} \\
&= \frac{\lambda^k e^{-\lambda}}{k!}.
\end{aligned}$$

Thus, $Y$ indeed has the specified range and distribution.
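The convergence in Theorem 13 can be watched numerically. The sketch below is an illustration only: the function names, λ = 3, and the range k = 0, …, 20 are arbitrary choices. It measures the largest gap between the binomial and Poisson probabilities for a given n.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(Y = k) for Y ~ Po(lam)."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def max_gap(lam, n, kmax=20):
    """Largest |P(X_n = k) - P(Y = k)| over k = 0..kmax, with X_n ~ B(n, lam/n)."""
    return max(
        abs(binomial_pmf(k, n, lam / n) - poisson_pmf(k, lam)) for k in range(kmax + 1)
    )


if __name__ == "__main__":
    for n in (100, 1000, 10_000):
        print(n, max_gap(3.0, n))  # the gap shrinks as n grows
```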

85.6 The Normal Distribution

Fact 104. $\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}$.

Proof. Omitted. See this Math.Stackexchange discussion or Wikipedia.

Fact 78 (reproduced from p. 629). Let $Z \sim N(0, 1)$ and $\varphi$ and $\Phi$ be its PDF and CDF.

1. $\Phi(\infty) = 1$. (As with any random variable, the area under the entire PDF is 1.)
2. $\varphi(a) > 0$, for all $a \in \mathbb{R}$. (The PDF is positive everywhere. This has a surprising implication: however large $a$ is, there is always some non-zero probability that $Z \ge a$.)
3. $E[Z] = 0$. (The mean of $Z$ is 0.)
4. The PDF $\varphi$ reaches a global maximum at the mean 0. (In fact, we can go ahead and compute $\varphi(0) = 1/\sqrt{2\pi} \approx 0.399$.)
5. $V[Z] = 1$. (The variance of $Z$ is 1.)
6. $P(Z \le a) = P(Z < a)$. (We’ve already discussed this earlier. It makes no difference whether the inequality is strict. This is because $P(Z = a) = 0$.)
7. The PDF $\varphi$ is symmetric about the mean. This has several implications:
(a) $P(Z \ge a) = P(Z \le -a) = \Phi(-a)$.
(b) Since $P(Z \ge a) = 1 - P(Z \le a) = 1 - \Phi(a)$, it follows that $\Phi(-a) = 1 - \Phi(a)$ or, equivalently, $\Phi(a) = 1 - \Phi(-a)$.
(c) $\Phi(0) = 1 - \Phi(0) = 0.5$.
8. $P(-1 \le Z \le 1) = \Phi(1) - \Phi(-1) \approx 0.6827$. (There is probability 0.6827 that $Z$ takes on values within 1 standard deviation of the mean.)
9. $P(-2 \le Z \le 2) = \Phi(2) - \Phi(-2) \approx 0.9545$. (There is probability 0.9545 that $Z$ takes on values within 2 standard deviations of the mean.)
10. $P(-3 \le Z \le 3) = \Phi(3) - \Phi(-3) \approx 0.9973$. (There is probability 0.9973 that $Z$ takes on values within 3 standard deviations of the mean.)
11. The PDF $\varphi$ has two points of inflexion, namely at $\pm 1$. (The points of inflexion are one standard deviation away from the mean.)

Proof. 1. Let $u = x/\sqrt{2}$. We have $u^2 = 0.5x^2$ and $du/dx = 1/\sqrt{2}$. And using Fact 104:

$$\Phi(\infty) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-0.5x^2}\,dx = \frac{1}{\sqrt{2\pi}} \int_{x=-\infty}^{x=\infty} e^{-0.5x^2} \sqrt{2}\,\frac{du}{dx}\,dx = \frac{1}{\sqrt{\pi}} \int_{u=-\infty}^{u=\infty} e^{-u^2}\,du = \frac{1}{\sqrt{\pi}} \sqrt{\pi} = 1.$$

2. Obvious.

3. $$E[Z] = \int_{-\infty}^{\infty} x\varphi(x)\,dx = \frac{-1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \left(-x e^{-0.5x^2}\right) dx = \frac{-1}{\sqrt{2\pi}} \left[e^{-0.5x^2}\right]_{-\infty}^{\infty} = \frac{-1}{\sqrt{2\pi}} [0 - 0] = 0.$$

4. $$\frac{d}{da}\varphi(a) = \frac{d}{da} \frac{1}{\sqrt{2\pi}} e^{-0.5a^2} = \frac{-a}{\sqrt{2\pi}} e^{-0.5a^2} \begin{cases} > 0, & \text{if } a < 0, \\ = 0, & \text{if } a = 0, \\ < 0, & \text{if } a > 0. \end{cases}$$

$\varphi$ is continuous, increasing for $a < 0$ and decreasing for $a > 0$. Thus, $\varphi$ reaches a global maximum at 0. By plugging in $a = 0$, we can compute this global maximum value to be $\varphi(0) = 1/\sqrt{2\pi} \approx 0.399$.

5. Integrating by parts with $u = x$ and $v' = x e^{-0.5x^2}$ (so that $v = -e^{-0.5x^2}$):

$$\begin{aligned}
V[Z] &= \int_{-\infty}^{\infty} (x - 0)^2 \varphi(x)\,dx = \int_{-\infty}^{\infty} x^2 \frac{1}{\sqrt{2\pi}} e^{-0.5x^2}\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \underbrace{x}_{u}\,\underbrace{x e^{-0.5x^2}}_{v'}\,dx \\
&= \frac{1}{\sqrt{2\pi}} \left( \left[-x e^{-0.5x^2}\right]_{-\infty}^{\infty} + \int_{-\infty}^{\infty} e^{-0.5x^2}\,dx \right) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-0.5x^2}\,dx = 1.
\end{aligned}$$

6. By the Additivity Axiom, $P(Z \le a) = P(Z < a \text{ or } Z = a) = P(Z < a) + P(Z = a) = P(Z < a) + 0 = P(Z < a)$, as desired.

7. Clearly, $\varphi(a) = e^{-0.5a^2}/\sqrt{2\pi} = e^{-0.5(-a)^2}/\sqrt{2\pi} = \varphi(-a)$ for all $a \in \mathbb{R}$. Thus, $\varphi$ is symmetric about the vertical axis $x = 0$, which is also the mean.

7(a). Using the substitution $u = -x$, we have $du/dx = -1$ and

$$P(Z \ge a) = \int_{x=a}^{x=\infty} \frac{e^{-0.5x^2}}{\sqrt{2\pi}}\,dx = \int_{u=-a}^{u=-\infty} \frac{-e^{-0.5u^2}}{\sqrt{2\pi}}\,du = \int_{u=-\infty}^{u=-a} \frac{e^{-0.5u^2}}{\sqrt{2\pi}}\,du = P(Z \le -a) = \Phi(-a).$$

7(b) and 7(c). Obvious.

8, 9, and 10. These can be computed numerically, using a computer.

11. $$\frac{d^2}{da^2}\varphi(a) = \frac{d}{da} \frac{-a}{\sqrt{2\pi}} e^{-0.5a^2} = \frac{1}{\sqrt{2\pi}} e^{-0.5a^2} \left(a^2 - 1\right) \begin{cases} > 0, & \text{if } a < -1, \\ = 0, & \text{if } a = -1, \\ < 0, & \text{if } -1 < a < 1, \\ = 0, & \text{if } a = 1, \\ > 0, & \text{if } a > 1. \end{cases}$$

Hence, $\pm 1$ are the only two points of inflexion since $\varphi$ changes concavity only here.
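Items 8, 9 and 10 are left to a computer, so here is one way that computation might look. This sketch relies on the standard identity Φ(a) = (1 + erf(a/√2))/2, which is not derived in this textbook, and uses Python's built-in `math.erf`; the function names `Phi` and `within` are illustrative.

```python
import math


def Phi(a):
    """CDF of Z ~ N(0, 1), via the error function: (1 + erf(a / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))


def within(k):
    """P(-k <= Z <= k) = Phi(k) - Phi(-k)."""
    return Phi(k) - Phi(-k)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, round(within(k), 4))
```

The same function can be used to confirm the symmetry properties in item 7, e.g. that Φ(−a) = 1 − Φ(a).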

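Theorem 27. Let $a, b \in \mathbb{R}$ be constants with $a \ne 0$ and $X$ be a continuous random variable with PDF $f_X$ and CDF $F_X$. Let $Y = aX + b$. Then

$$f_Y(c) = \frac{1}{|a|} f_X\left(\frac{c-b}{a}\right).$$

Proof. $F_Y(c) = P(Y \le c) = P(aX + b \le c) = P(aX \le c - b)$.

Case #1. If $a > 0$, then $F_Y(c) = \dots = P(aX \le c - b) = P\left(X \le \frac{c-b}{a}\right) = F_X\left(\frac{c-b}{a}\right)$. Now differentiate:

$$f_Y(c) = \frac{d}{dc} F_Y(c) = \frac{d}{dc} F_X\left(\frac{c-b}{a}\right) = \frac{1}{a} f_X\left(\frac{c-b}{a}\right) = \frac{1}{|a|} f_X\left(\frac{c-b}{a}\right).$$

Case #2. If $a < 0$, then $F_Y(c) = \dots = P(aX \le c - b) = P\left(X \ge \frac{c-b}{a}\right) = 1 - F_X\left(\frac{c-b}{a}\right)$. Now differentiate:

$$f_Y(c) = \frac{d}{dc} F_Y(c) = \frac{d}{dc} \left[1 - F_X\left(\frac{c-b}{a}\right)\right] = -\frac{1}{a} f_X\left(\frac{c-b}{a}\right) = \frac{1}{|a|} f_X\left(\frac{c-b}{a}\right).$$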

Fact 79 (reproduced from p. 638). Let $X \sim N(\mu, \sigma^2)$ and $a, b \in \mathbb{R}$ be constants. Then $aX + b \sim N(a\mu + b, a^2\sigma^2)$.

Proof. By Theorem 27, the PDF of $aX + b$ is given by

$$f_{aX+b}(c) = \frac{1}{|a|} f_X\left(\frac{c-b}{a}\right) = \frac{1}{|a|} \frac{1}{\sigma\sqrt{2\pi}} e^{-0.5\left(\frac{\frac{c-b}{a} - \mu}{\sigma}\right)^2} = \frac{1}{|a|\sigma\sqrt{2\pi}} e^{-0.5\left[\frac{c - (a\mu + b)}{a\sigma}\right]^2}.$$

But this lattermost expression is indeed the PDF of the random variable with distribution $N(a\mu + b, a^2\sigma^2)$.
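The algebra in this proof can be spot-checked numerically. In the sketch below, the function names and the particular values of a, b, µ, σ are arbitrary illustrative choices; it compares the PDF given by Theorem 27 with the PDF of N(aµ + b, a²σ²) at a few points.

```python
import math


def normal_pdf(x, mu, sigma):
    """PDF of N(mu, sigma^2) at x (sigma > 0)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def pdf_of_linear_transform(c, a, b, mu, sigma):
    """Theorem 27 applied to X ~ N(mu, sigma^2): f_Y(c) = f_X((c - b) / a) / |a|."""
    return normal_pdf((c - b) / a, mu, sigma) / abs(a)


if __name__ == "__main__":
    a, b, mu, sigma = -2.0, 3.0, 1.0, 0.5
    for c in (-1.0, 0.0, 1.0, 2.5):
        print(c, pdf_of_linear_transform(c, a, b, mu, sigma),
              normal_pdf(c, a * mu + b, abs(a) * sigma))
```

A negative value of a is used on purpose, to exercise Case #2 of Theorem 27.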

85.7 Sampling

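Fact 81 (reproduced from p. 678). Let $S = (X_1, X_2, \dots, X_n)$ be a random sample of size $n$. Let $\bar{X}$ be the sample mean and $S^2$ be the sample variance. Let $a \in \mathbb{R}$ be a constant. Then

$$\text{(a)}\quad S^2 = \frac{\sum_{i=1}^{n} X_i^2 - \frac{\left[\sum_{i=1}^{n} X_i\right]^2}{n}}{n-1} \qquad \text{and} \qquad \text{(b)}\quad S^2 = \frac{\sum_{i=1}^{n} (X_i - a)^2 - \frac{\left[\sum_{i=1}^{n} (X_i - a)\right]^2}{n}}{n-1}.$$

Proof. This proof may look intimidating but it’s really just a bunch of tedious algebra. (I’ve also tried to go slow with the algebra, so more steps are explicitly listed than is typical in a proof.)

(a) Start from the definition of the sample variance and do the algebra:

$$\begin{aligned}
S^2 &= \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1} = \frac{\sum_{i=1}^{n} \left(X_i^2 + \bar{X}^2 - 2\bar{X}X_i\right)}{n-1} = \frac{\sum_{i=1}^{n} X_i^2 + \sum_{i=1}^{n} \bar{X}^2 - \sum_{i=1}^{n} \left(2\bar{X}X_i\right)}{n-1} \\
&= \frac{\sum_{i=1}^{n} X_i^2 + n\bar{X}^2 - 2\bar{X}\sum_{i=1}^{n} X_i}{n-1} = \frac{\sum_{i=1}^{n} X_i^2 + n\bar{X}^2 - 2\bar{X}(n\bar{X})}{n-1} = \frac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n-1} \\
&= \frac{\sum_{i=1}^{n} X_i^2 - n\left[\frac{\sum_{i=1}^{n} X_i}{n}\right]^2}{n-1} = \frac{\sum_{i=1}^{n} X_i^2 - \frac{\left[\sum_{i=1}^{n} X_i\right]^2}{n}}{n-1}.
\end{aligned}$$

(b) Start from the formula found in (a) and do the algebra:

$$\begin{aligned}
S^2 &= \frac{\sum_{i=1}^{n} X_i^2 - \frac{\left[\sum_{i=1}^{n} X_i\right]^2}{n}}{n-1} = \frac{\sum_{i=1}^{n} (X_i - a + a)^2 - \frac{\left[\sum_{i=1}^{n} (X_i - a + a)\right]^2}{n}}{n-1} \\
&= \frac{\sum_{i=1}^{n} \left[(X_i - a)^2 + a^2 + 2(X_i - a)a\right] - \frac{\left[\sum_{i=1}^{n} (X_i - a) + \sum_{i=1}^{n} a\right]^2}{n}}{n-1} \\
&= \frac{\sum_{i=1}^{n} (X_i - a)^2 + \sum_{i=1}^{n} a^2 + 2a\sum_{i=1}^{n} (X_i - a) - \frac{\left[\sum_{i=1}^{n} (X_i - a)\right]^2 + \left(\sum_{i=1}^{n} a\right)^2 + 2\sum_{i=1}^{n} (X_i - a)\sum_{i=1}^{n} a}{n}}{n-1} \\
&= \frac{\sum_{i=1}^{n} (X_i - a)^2 + na^2 + 2a\sum_{i=1}^{n} (X_i - a) - \frac{\left[\sum_{i=1}^{n} (X_i - a)\right]^2 + (na)^2 + 2na\sum_{i=1}^{n} (X_i - a)}{n}}{n-1} \\
&= \frac{\sum_{i=1}^{n} (X_i - a)^2 - \frac{\left[\sum_{i=1}^{n} (X_i - a)\right]^2}{n}}{n-1}.
\end{aligned}$$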
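Since the algebra is tedious, it is reassuring to check both formulas on actual numbers. The sketch below uses a small made-up data set, exact fractions, and Python's `statistics.variance` (which computes the sample variance with the n − 1 divisor) as the reference; the function names are illustrative.

```python
from fractions import Fraction
from statistics import variance


def s2_formula_a(xs):
    """Formula (a): [sum of X_i^2 - (sum of X_i)^2 / n] / (n - 1)."""
    n = len(xs)
    return (sum(x * x for x in xs) - Fraction(sum(xs)) ** 2 / n) / (n - 1)


def s2_formula_b(xs, a):
    """Formula (b): the same expression applied to the shifted data X_i - a."""
    n = len(xs)
    d = [x - a for x in xs]
    return (sum(t * t for t in d) - Fraction(sum(d)) ** 2 / n) / (n - 1)


if __name__ == "__main__":
    data = [3, 7, 7, 19]  # made-up sample
    print(variance([Fraction(x) for x in data]))
    print(s2_formula_a(data), s2_formula_b(data, 5), s2_formula_b(data, -100))
```

Formula (b) says the shift a does not matter, which is handy in practice: choosing a close to the data keeps the numbers being squared small.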

Proposition 15 (reproduced from p. 684). Let $(X_1, X_2, \dots, X_n)$ be a random sample drawn from a distribution with population mean $\mu$ and population variance $\sigma^2$. Let $\bar{X}$ be the sample mean and $S^2$ be the sample variance. Then

(a) $E[\bar{X}] = \mu$. And

(b) $E[S^2] = \sigma^2$.

Proof. (a) was proven in Exercise 259. We prove only (b). Equation $\overset{1}{=}$ is the key piece of intuition (and is formally proven below):

$$\underbrace{E\left[(X_i - \bar{X})^2\right]}_{\text{The degree to which } X_i \text{ varies from } \bar{X}} + \underbrace{E\left[(\bar{X} - \mu)^2\right]}_{\text{The degree to which } \bar{X} \text{ varies from } \mu} \overset{1}{=} \underbrace{E\left[(X_i - \mu)^2\right]}_{\text{The degree to which } X_i \text{ varies from } \mu}.$$

Rearranging:

$$E\left[(X_i - \bar{X})^2\right] = \underbrace{E\left[(X_i - \mu)^2\right]}_{\text{Population variance}} - \underbrace{E\left[(\bar{X} - \mu)^2\right]}_{\text{Variance of sample mean}} = \sigma^2 - \frac{\sigma^2}{n} = \frac{n-1}{n}\sigma^2.$$

We’ve just shown that $(X_i - \bar{X})^2$ is a biased estimator for $\sigma^2$. And in turn, $S^2$ is not:

$$E\left[S^2\right] = E\left[\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}\right] = E\left[\frac{\sum_{i=1}^{n} \frac{n}{n-1}(X_i - \bar{X})^2}{n}\right] = \frac{\sum_{i=1}^{n} E\left[\frac{n}{n-1}(X_i - \bar{X})^2\right]}{n} = \frac{n\sigma^2}{n} = \sigma^2.$$

As promised, here is the proof of equation $\overset{1}{=}$:

$$\begin{aligned}
E\left[(X_i - \bar{X})^2\right] + E\left[(\bar{X} - \mu)^2\right] &= E\left[(X_i - \bar{X})^2 + (\bar{X} - \mu)^2\right] \\
&= E\left[\left((X_i - \bar{X}) + (\bar{X} - \mu)\right)^2 - 2(X_i - \bar{X})(\bar{X} - \mu)\right] \\
&= E\left[(X_i - \mu)^2 - 2(X_i - \bar{X})(\bar{X} - \mu)\right] \\
&= E\left[(X_i - \mu)^2 - 2\left(X_i\bar{X} - \mu X_i - \bar{X}^2 + \mu\bar{X}\right)\right] \\
&= E\left[(X_i - \mu)^2 - 2\left(X_i\bar{X} - \bar{X}^2\right)\right] \\
&= E\left[(X_i - \mu)^2\right] + 2\left\{E\left[\bar{X}^2\right] - E\left[\bar{X}X_i\right]\right\} \\
&= E\left[(X_i - \mu)^2\right].
\end{aligned}$$

The last equality follows because $E\left[\bar{X}X_i\right] = \frac{\sum_{i=1}^{n} E\left[\bar{X}X_i\right]}{n} = E\left[\bar{X}\,\frac{\sum_{i=1}^{n} X_i}{n}\right] = E\left[\bar{X}^2\right]$.
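For a small population, the claim E[S²] = σ² can be checked exactly by listing every possible sample. The sketch below is an illustration under assumptions chosen for this example: the population is the six faces of a fair die, samples are drawn with replacement (so that the draws are independent), and every sample is equally likely. The function names are made up.

```python
from fractions import Fraction
from itertools import product


def expected_sample_variance(population, n):
    """Average of S^2 over all equally likely samples of size n (with replacement)."""
    total, count = Fraction(0), 0
    for sample in product(population, repeat=n):
        mean = Fraction(sum(sample), n)
        total += sum((x - mean) ** 2 for x in sample) / (n - 1)
        count += 1
    return total / count


def population_variance(population):
    """sigma^2 for a value drawn uniformly from `population`."""
    size = len(population)
    mu = Fraction(sum(population), size)
    return sum((x - mu) ** 2 for x in population) / size


if __name__ == "__main__":
    die = [1, 2, 3, 4, 5, 6]
    print(expected_sample_variance(die, 3), population_variance(die))
```

Replacing the divisor n − 1 by n inside `expected_sample_variance` would instead give (n − 1)/n times σ², the biased value derived in the proof.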

85.8 Null Hypothesis Significance Testing

Definition 172. The random variable $T_\nu$ with Student’s t-distribution with $\nu$ degrees of freedom has PDF $f : \mathbb{R} \to \mathbb{R}$ given by mapping rule

$$f(t) = \frac{\int_0^{\infty} x^{\frac{\nu+1}{2} - 1} e^{-x}\,dx}{\sqrt{\nu\pi} \int_0^{\infty} x^{\frac{\nu}{2} - 1} e^{-x}\,dx} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}.$$

85.9 Calculating the Margin of Error

Let $\mu$ be the true population proportion (of votes for Dr. Chee). Say we take a random sample of size 900.⁹² Let $X$ be the sample number of votes for Dr. Chee. We know that $X \sim B(900, \mu)$. Our confidence level is 95%. So we want to find the smallest $k$ such that

$$P(900\mu - k \le X \le 900\mu + k) \ge 0.95.$$

And $\pm k/900$ will be our margin of error.

Case #1: Perfect hindsight: $\mu = 9142/23570$.

With perfect hindsight, we now know that $\mu = 9142/23570$. So $X \sim B(900, 9142/23570)$. We want to find the smallest $k$ such that

$$P(349 - k \le X \le 349 + k) \ge 0.95,$$

where $900 \times 9142/23570 \approx 349$. Using the “Binomial” sheet at the usual link, we have

$$P(349 - 28 \le X \le 349 + 28) \approx 0.9488, \qquad P(349 - 29 \le X \le 349 + 29) \approx 0.9565.$$

Thus, $k = 29$. Now, $29/900 \approx 3.2\%$. Thus, at a 95% confidence level, the margin of error is ±3.2%.

This is the “true” margin of error, assuming we know $\mu$. But this assumption defeats the point of sampling: we don’t know $\mu$, which is why we’re doing sampling in the first place! What we want instead is the margin of error in the case where $\mu$ is unknown.

Case #2: Without perfect hindsight: $\mu$ unknown.

With $\mu$ unknown, a conservative interpretation would be to find the smallest $k$ such that for all $\mu$,

$$P(900\mu - k \le X \le 900\mu + k) \ge 0.95.$$

Observe that $V[X] = 900\mu(1 - \mu)$ is maximised at $\mu = 0.5$. Thus, it is plausible⁹³ that if $k$ satisfies

$$X \sim B(900, 0.5) \implies P(900 \times 0.5 - k \le X \le 900 \times 0.5 + k) \ge 0.95,$$

then $k$ also satisfies

$$X \sim B(900, \mu) \implies P(900\mu - k \le X \le 900\mu + k) \ge 0.95.$$

Our problem thus boils down to finding the smallest $k$ such that $X \sim B(900, 0.5)$ implies

$$P(450 - k \le X \le 450 + k) \ge 0.95.$$

We have

$$P(450 - 29 \le X \le 450 + 29) \approx 0.9508, \qquad P(450 - 28 \le X \le 450 + 28) \approx 0.9426.$$

We conclude that the smallest such $k$ is 29. Now, $29/900 \approx 3.2\%$. So the margin of error may be given as ±3.2%. This is the same as what was calculated above, which is not surprising, since $9142/23570 \approx 0.388$ is close to 0.5.

The reader will, of course, wonder why the Elections Department stated that the margin of error was ±4%, rather than ±3.2% as I calculated here. I am not sure myself. My guess is that they probably don’t bother going through all the above calculations afresh each time. Instead, each time they report a sample count, they simply read off the margin of error from a table that looks something like this:

Sample Size      Approximate Margin of Error
400 − 599        ±5%
600 − 999        ±4%
1000 − 2000      ±3%

(By the way, note that it is common to use the CLT approximation when calculating the margin of error. I have not done so here. Instead, I’ve stuck with using the original, exact binomial distribution.)

⁹² This is slightly different from what actually happened: (1) The actual random sampling was most likely without replacement (which would change the maths slightly). (2) 100 votes were taken from each of 9 different polling stations (which would also change the maths slightly).

⁹³ Proving this would need a little work though.


85.10 Correlation and Linear Regression

Fact 105. Let x1, x2, . . . , xn and y1, y2, . . . , yn be numbers. Let x̄ = ∑ xi /n and ȳ = ∑ yi /n. Then

$$\frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} \in [-1, 1].$$

Proof. Let u = (x1 − x̄, x2 − x̄, . . . , xn − x̄) and v = (y1 − ȳ, y2 − ȳ, . . . , yn − ȳ) be n-dimensional vectors. Then

$$\frac{u \cdot v}{|u|\,|v|} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}}.$$

But from what we learnt about vectors,94 if θ is the angle between two vectors, then

$$\cos\theta = \frac{u \cdot v}{|u|\,|v|}.$$

Since cos θ ∈ [−1, 1], the result follows.

94 Of course, in this textbook, we’ve only shown that this is true for two- and three-dimensional vectors. But let’s just wave our hands and say that this is also true for higher-dimensional vectors.
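To make the "correlation is a cosine" idea concrete, here is a minimal Python sketch that computes the ratio in Fact 105 as the dot product of the two centred vectors divided by the product of their lengths. The function name `correlation` and the sample data are my own choices for illustration.

```python
from math import sqrt


def correlation(xs, ys):
    """Cosine of the angle between the centred data vectors u and v."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    u = [x - x_bar for x in xs]  # u = (x1 - x_bar, ..., xn - x_bar)
    v = [y - y_bar for y in ys]  # v = (y1 - y_bar, ..., yn - y_bar)
    dot = sum(a * b for a, b in zip(u, v))
    length_u = sqrt(sum(a * a for a in u))
    length_v = sqrt(sum(b * b for b in v))
    return dot / (length_u * length_v)


print(correlation([1, 2, 3, 4], [2, 1, 4, 3]))
```

Data lying exactly on an upward-sloping line gives a value of 1 (angle 0), while data on a downward-sloping line gives −1 (angle π).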


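Fact 85 (reproduced from p. 732). Let (x1, x2, . . . , xn) and (y1, y2, . . . , yn) be two ordered sets of data. The OLS regression line of y on x is y − ȳ = b̂(x − x̄), where

$$\text{(i)}\quad \hat{b} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2}, \qquad \text{(ii)}\quad \hat{b} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2}.$$

Moreover, the regression line can also be written in the form y = â + b̂x, where b̂ is as given above and â = ȳ − b̂x̄.

Proof. (Continued from the proof begun on p. 732.) Remember that the data (x1, x2, . . . , xn) and (y1, y2, . . . , yn) are given. Thus, we can treat all the xi s and yi s as constants. We have:

$$\frac{\partial}{\partial \hat{a}}\sum \hat{u}_i^2 = \sum \frac{\partial}{\partial \hat{a}}\hat{u}_i^2 = \sum\left(2\hat{u}_i\frac{\partial \hat{u}_i}{\partial \hat{a}}\right) = \sum -2\left[y_i - (\hat{a}+\hat{b}x_i)\right].$$

Thus,

$$\frac{\partial}{\partial \hat{a}}\sum \hat{u}_i^2 = 0 \iff \sum\left[y_i - (\hat{a}+\hat{b}x_i)\right] = 0 \iff \hat{a} \overset{1}{=} \bar{y} - \hat{b}\bar{x}.$$

We also have:

$$\frac{\partial}{\partial \hat{b}}\sum \hat{u}_i^2 = \sum \frac{\partial}{\partial \hat{b}}\hat{u}_i^2 = \sum\left(2\hat{u}_i\frac{\partial \hat{u}_i}{\partial \hat{b}}\right) = \sum -2x_i\left[y_i - (\hat{a}+\hat{b}x_i)\right].$$

Thus,

$$\frac{\partial}{\partial \hat{b}}\sum \hat{u}_i^2 = 0 \iff \sum\left[y_i - (\hat{a}+\hat{b}x_i)\right]x_i = 0.$$

Plugging $\overset{1}{=}$ into this last equation, we have

$$\sum\left[y_i - (\bar{y} - \hat{b}\bar{x} + \hat{b}x_i)\right]x_i = 0.$$

Tedious algebra yields Formula (ii):

$$\hat{b} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2}.$$

More algebra yields Formula (i).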
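As a sanity check on the "tedious algebra", the following Python sketch computes the slope using both Formula (i) and Formula (ii) and confirms numerically that they agree on a small made-up data set. The function name `ols` and the data are hypothetical, chosen only for illustration.

```python
def ols(xs, ys):
    """Return (a_hat, b_hat from Formula (i), b_hat from Formula (ii))."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n

    # Formula (i): centred sums.
    b_i = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
           / sum((x - x_bar) ** 2 for x in xs))

    # Formula (ii): raw sums with the n * x_bar * y_bar correction.
    b_ii = ((sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar)
            / (sum(x * x for x in xs) - n * x_bar ** 2))

    a_hat = y_bar - b_i * x_bar  # intercept, from a_hat = y_bar - b_hat * x_bar
    return a_hat, b_i, b_ii


a_hat, b_i, b_ii = ols([1, 2, 3, 4], [2, 3, 5, 6])
print(a_hat, b_i, b_ii)
```

In floating-point arithmetic, Formula (i) is usually the safer choice, because Formula (ii) subtracts two potentially large numbers of similar size.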


85.10.1 Deriving a Linear Model from the Barometric Formula

According to NASA (1976), “U.S. Standard Atmosphere”, p. 12, eq. (33a) (PDF), the barometric formula (relating pressure P to height h above sea level), in the case where $L_{M,b} \neq 0$, is given by:

$$P = P_b\left[\frac{T_{M,b}}{T_{M,b} + L_{M,b}(h - h_b)}\right]^{\frac{g_0' M}{R^* L_{M,b}}},$$

where $P_b$, $T_{M,b}$, $L_{M,b}$, $h_b$, $g_0'$, $R^*$ are simply constants. Now, do the algebra:

$$\begin{aligned}
P &= P_b\left[\frac{T_{M,b}}{T_{M,b} + L_{M,b}(h - h_b)}\right]^{\frac{g_0' M}{R^* L_{M,b}}} \\
  &= P_b\left[\frac{T_{M,b} + L_{M,b}(h - h_b)}{T_{M,b}}\right]^{-\frac{g_0' M}{R^* L_{M,b}}} \\
  &= P_b\left[1 + \frac{L_{M,b}}{T_{M,b}}(h - h_b)\right]^{-\frac{g_0' M}{R^* L_{M,b}}}, \\
\ln P &= \ln P_b - \frac{g_0' M}{R^* L_{M,b}}\ln\left[1 + \frac{L_{M,b}}{T_{M,b}}(h - h_b)\right].
\end{aligned}$$

Now, for heights up to 11,000 m above sea level, $h_b$ is simply the height at sea level. That is, $h_b = 0$ m. If we also let $a = \ln P_b$ and $b = -\dfrac{g_0' M}{R^* L_{M,b}}$ and get rid of the subscripts in $L_{M,b}$ and $T_{M,b}$ (just to make it neater), then we have:

$$\ln P = a + b\ln\left(1 + \frac{L}{T}h\right).$$

For heights up to 11,000 m above sea level, L = −0.0065 kelvin per metre is the temperature lapse rate (the rate at which the temperature changes as we go up in altitude; the negative sign means the temperature falls; see p. 3, Table 4) and T = 288.15 kelvin is the standard sea-level temperature (also precisely equal to 15 °C; see p. 4).
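Here is a rough Python sketch of the linear model ln P = a + b ln(1 + Lh/T). The numerical constants below are standard published values that I am assuming for illustration (they are not stated in the text above): g0′ = 9.80665 m/s², M = 0.0289644 kg/mol, R* = 8.31432 J/(mol·K), sea-level pressure 101325 Pa, and a lapse rate of L = −0.0065 K/m (that is, −6.5 K per kilometre).

```python
from math import exp, log

# Assumed constants (standard values; not taken from the text above).
G0 = 9.80665       # m/s^2, standard gravity
M = 0.0289644      # kg/mol, molar mass of air
R_STAR = 8.31432   # J/(mol K), gas constant
P_B = 101325.0     # Pa, sea-level pressure
T = 288.15         # K, sea-level temperature
L = -0.0065        # K/m, temperature lapse rate

a = log(P_B)                    # intercept of the linear model
b = -G0 * M / (R_STAR * L)      # slope of the linear model


def ln_pressure(h: float) -> float:
    """ln P at height h (metres), valid for 0 <= h <= 11000."""
    return a + b * log(1 + L * h / T)


def pressure(h: float) -> float:
    """Pressure in pascals at height h (metres)."""
    return exp(ln_pressure(h))


print(b, pressure(0), pressure(11000))
```

The point of the rewriting is that ln P becomes a linear function of the transformed variable ln(1 + Lh/T), which is the form needed for linear regression.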


Part IX

Answers to Exercises

My answers here are often more verbose than what would be necessary for you to get full credit on an exam. The reason is to help you understand my answers better.


86 Answers to Exercises in Part I: Functions and Graphs

86.1 Answers to Exercises in Ch. 1: Sets

Answer to Exercise 1. (Tip: Click on number to return to that exercise.) The set of the first seven integers is B = {1, 2, 3, 4, 5, 6, 7}.

Answer to Exercise 2. There is only one even prime number, namely 2. Hence, C = {2}.

Answer to Exercise 3. The set W = {Apple, Apple, Apple, Banana, Banana, Apple} has only two distinct elements. Hence, n(W) = 2. We can rewrite the set more simply as W = {Apple, Banana}.

Answer to Exercise 4. There is only one even prime number, namely 2. Hence, n(C) = 1.

Answer to Exercise 5. D is the set containing the first 50 odd positive integers; hence, n(D) = 50. And T is the set containing the first 99 negative integers; hence, n(T) = 99.

Answer to Exercise 6. The set of all primes is H = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, . . . }.

Answer to Exercise 7. If U = {−1, 0, 2}, then U + = {2}, U − = {−1}, U0 = {−1, 0, 2}, U0+ = {0, 2}, and U0− = {−1, 0}.

Answer to Exercise 8. The set Z = [1, 1] contains only one element, namely the number 1. So actually, we can also write Z = {1}. Answer to Exercise 9. The set Y = (1, 1) contains no elements. So actually, we can also write Y = ∅.

Answer to Exercise 10. The set X = (1, 1.01) contains infinitely many elements, namely all the real numbers that are greater than 1 but smaller than 1.01.

Answer to Exercise 11. R = (−∞, ∞), R+ = (0, ∞), R+0 = [0, ∞), R− = (−∞, 0), and R−0 = (−∞, 0].

Answer to Exercise 12. (a) Every integer is also a rational number and a real number; hence, Z ⊆ Q, R. (b) A rational number is also a real number; hence, Q ⊆ R. However, some rational numbers are not integers (e.g. 1.5 is rational but is not an integer); hence, Q ⊈ Z. (c) Some real numbers are neither rational nor integers (e.g. π); hence, R ⊈ Z, Q.

Answer to Exercise 13. True. The set of currently-serving Singapore Prime Ministers is {Lee Hsien Loong}. The set of currently-serving Singapore Ministers is {Lee Hsien Loong, Tharman, Teo Chee Hean, Khaw Boon Wan, . . . }. The latter set contains every element that is in the former set. Hence, the former is a subset of the latter.

Answer to Exercise 14. Yes, the set of squares is a proper subset of the set of rectangles. All squares are rectangles and so S ⊆ R. Moreover, some rectangles are not squares and so S ≠ R. Altogether then, by Definition 6, S ⊂ R.

Answer to Exercise 15. No, that one set is a subset of another does not imply that the former is also a proper subset of the latter. It may be that the two sets are equal. For example, if A = {1, 2} and B = {1, 2}, then A ⊆ B, but A ⊄ B.

Answer to Exercise 16. Yes. By definition, A ⊂ B requires that A ⊆ B.

Answer to Exercise 17. True: “If A is a subset of B, then A is either a proper subset of or is equal to B.” Answer to Exercise 18. (a) [1, 2] ∪ [2, 3] = [1, 3]. (b) (−∞, −3) ∪ [−16, 7) = (−∞, 7). (c) {0} ∪ Z+ = Z+0 .

Answer to Exercise 19. S ∪ R = R. In words, “the set of all squares and all rectangles” is itself simply “the set of all rectangles”. Answer to Exercise 20. All real numbers are either rational or irrational. Hence, the set of all rationals and irrationals is itself simply “the set of all reals” or R. Answer to Exercise 21. (a) (4, 7] ∩ (6, 9) = (6, 7]. (b) [1, 2] ∩ [5, 6] = ∅. (c) (−∞, −3) ∩ [−16, 7) = [−16, −3). Answer to Exercise 22. S ∩ R = S. In words, the intersection of these two sets is simply itself the set of all squares. This is because the only objects that are BOTH squares AND rectangles are squares.

Answer to Exercise 23. It is the empty set (∅). This is because there is no object that is BOTH rational AND irrational. Answer to Exercise 24. R− = {x ∈ R ∶ x < 0}, Q− = {x ∈ Q ∶ x < 0}, and Z− = {x ∈ Z ∶ x < 0}. Also, R−0 = {x ∈ R ∶ x ≤ 0}, Q−0 = {x ∈ Q ∶ x ≤ 0}, and Z−0 = {x ∈ Z ∶ x ≤ 0}.

Answer to Exercise 25. (a, b) = {x ∈ R ∶ a < x < b}, [a, b] = {x ∈ R ∶ a ≤ x ≤ b}, (a, b] = {x ∈ R ∶ a < x ≤ b}, and [a, b) = {x ∈ R ∶ a ≤ x < b}.

Answer to Exercise 26. The set of all living Singapore Prime Ministers (current or former) is X = {Goh Chok Tong, Lee Hsien Loong}.

Answer to Exercise 27. (a) (−∞, −3) ∪ (5, ∞) = {x ∈ R ∶ −∞ < x < −3, 5 < x < ∞}. (b) (−∞, √2] ∪ (e, π) ∪ (π, ∞) = {x ∈ R ∶ −∞ < x ≤ √2, e < x < π, π < x < ∞}. (c) (−∞, 3) ∩ (0, 7) = {x ∈ R ∶ 0 < x < 3}.

86.2 Answers to Exercises in Ch. 2: Dividing by Zero

Answer to Exercise 28. The error is in Step #5. Since x = y, we have x − y = 0. Hence, we cannot divide both sides by x − y.

86.3 Answers to Exercises in Ch. 3: Functions

Answer to Exercise 29. f(1) = 1 + 1 = 2, g(1) = 17(1) = 17, and h(1) = 3¹ = 3. i(1) is simply undefined because 1 is not in the domain Z− = {−1, −2, −3, . . . }.

Answer to Exercise 30. (i) Yes. (ii) Every element in the domain is assigned to exactly one element in the codomain. Specifically, 5 ↦ 10, 6 ↦ 12, and 7 ↦ 14. (iii) The function f ∶ {5, 6, 7} → {⋅ ⋅ ⋅ − 6, −4, −2, 0, 2, 4, 6, . . . } is defined by x ↦ 2x (or alternatively, f (x) = 2x). Answer to Exercise 31. (i) No. (ii) By the rule, the function would map 0 to 3 and/or 4. Thus, this violates the requirement that every element in the domain is assigned to exactly one element in the codomain, because the element 0 in the domain is assigned to more than one element in the codomain. (iii) NA.

Answer to Exercise 32. (i) No. (ii) By the rule, the function would map 2 to no element in the codomain; and 4 to 3. Thus, this violates the requirement that every element in the domain is assigned to exactly one element in the codomain, because the element 2 in the domain is not assigned to any element in the codomain. (iii) NA. Answer to Exercise 33. (i) Yes. (ii) By the rule, the function would simply map 1 in the domain to 1 in the codomain. And so every element in the domain is assigned to one (and exactly one) element in the codomain, as required. (iii) The function f ∶ {1} → {1} is defined by x ↦ x (or alternatively, f (x) = x).

Answer to Exercise 34. (i) Yes. (ii) By the rule, the function would simply map 1 in the domain to 1 in the codomain. And so every element in the domain is assigned to one (and exactly one) element in the codomain, as required. (iii) The function f ∶ {1} → {1, 2} is defined by x ↦ x (or alternatively, f (x) = x).

Answer to Exercise 35. (i) No. (ii) By the rule, the function would map 1 in the domain to 1 in the codomain, but it would fail to map 2 in the domain to any element in the codomain. This fails the requirement that every element in the domain is assigned to exactly one element in the codomain. (iii) NA.

Answer to Exercise 36. (i) No. (ii) By the rule, the function would map −1 to no element in the codomain, because √−1 ∉ R. Thus, this violates the requirement that every element in the domain is assigned to exactly one element in the codomain, because the element −1 in the domain is not assigned to any element in the codomain. (iii) NA.

Answer to Exercise 37. (i) No. (ii) By the rule, the function would map 0 to no element in the codomain, because 1/0 is undefined and is thus not in the codomain. This violates the requirement that every element in the domain is assigned to exactly one element in the codomain, because the element 0 in the domain is not assigned to any element in the codomain. (iii) NA.

Answer to Exercise 38. Let the domain instead be R+0. Then the domain simply consists of the non-negative real numbers. And the square root of any non-negative real number is simply itself a real number. And so indeed, every element in the domain can be mapped to exactly one element in the codomain. Formally we’d say, “The function f ∶ R+0 → R is defined by x ↦ √x (or alternatively, f(x) = √x).”

Answer to Exercise 39. Let the domain instead be R+ ∪ R−. That is, the domain consists of all real numbers except 0. The reciprocal of any real number (other than 0) is simply some other real number. And so indeed, every element in the domain can be mapped to exactly one element in the codomain. Formally we’d say, “The function f ∶ R+ ∪ R− → R is defined by x ↦ 1/x (or alternatively, f(x) = 1/x).”

Answer to Exercise 40. f(R+0) = R+0.

Answer to Exercise 41. f (Z) = {0, 1, 4, 9, 16, 25, 36, . . . }.

Answer to Exercise 42. Only (b) is true: “The range of any function is a subset of its codomain.” The range of a function need not be a subset of its domain, so (a) is false. The range of a function is often but not always a proper subset of its codomain, so (c) is false.

Answer to Exercise 43. (a) To check whether the function f ∶ R+0 → R defined by x ↦ √x is one-to-one, we need to show that every element y in the range corresponds to exactly one element x in the domain. To this end, pick any element y in the range and write:

y = √x ⇐⇒ y² = x.

Thus, indeed, this function is one-to-one: every element y in the range corresponds to exactly one element in the domain, namely y².

(b) To check whether the function g ∶ R+0 → R defined by x ↦ x² is one-to-one, we need to show that every element y in the range corresponds to exactly one element x in the domain. To this end, pick any element y in the range and write:

y = x² ⇐⇒ ±√y = x.

The domain consists of only non-negative reals. And so it is impossible that x = −√y (unless y = 0, in which case −√y = √y = 0). So this function is indeed one-to-one: every element y in the range corresponds to exactly one element in the domain, namely √y.

(c) The function h ∶ R → R defined by x ↦ ∣x∣ is not one-to-one. For example, 23 in the range is “hit” once by 23 and again by −23.

(d) To check whether the function i ∶ R+0 → R defined by x ↦ ∣x∣ is one-to-one, we need to show that every element y in the range corresponds to exactly one element x in the domain. To this end, pick any element y in the range and write:

$$y = |x| \iff y = \begin{cases} x, & \text{if } x \ge 0, \\ -x, & \text{if } x < 0. \end{cases}$$

The domain consists of only non-negative reals. And so it is impossible that x < 0. So this function is indeed one-to-one: every element y in the range corresponds to exactly one element in the domain, namely y.

(e) The function j ∶ R → R defined by x ↦ sin x is not one-to-one. For example, 0 is “hit” by infinitely many elements in the domain, namely . . . , −2π, −π, 0, π, 2π, . . . . This is because sin(−2π) = 0, sin(−π) = 0, etc.
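The one-to-one checks above can be explored numerically with a small Python sketch. Note the limitation: testing a finite sample of the domain can only find a counter-example (two inputs "hitting" the same output); it can never prove that a function is one-to-one. The helper name `is_one_to_one_on` is my own.

```python
from math import pi, sin, sqrt


def is_one_to_one_on(f, sample, digits=9):
    """Return False if two different sample points share an output."""
    seen = {}  # rounded output -> the input that produced it
    for x in sample:
        y = round(f(x), digits)  # rounding absorbs floating-point noise
        if y in seen and seen[y] != x:
            return False
        seen[y] = x
    return True


non_negative = [k / 10 for k in range(101)]   # 0.0, 0.1, ..., 10.0
print(is_one_to_one_on(sqrt, non_negative))              # part (a)
print(is_one_to_one_on(lambda x: x * x, non_negative))   # part (b)
print(is_one_to_one_on(abs, range(-23, 24)))             # part (c)
print(is_one_to_one_on(abs, non_negative))               # part (d)
print(is_one_to_one_on(sin, [k * pi for k in range(-2, 3)]))  # part (e)
```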

Answer to Exercise 44 (a).

1. The function f ∶ R+0 → R defined by x ↦ √x has range R+0. So the inverse function has domain R+0.

2. The domain of f is R+0. So the inverse function has codomain R+0.

3. Pick any element y in the range and write:

$$y = f(x) \iff y = \sqrt{x} \iff \underbrace{y^2}_{f^{-1}(y)} = x.$$

So f⁻¹ has mapping rule y ↦ y².

(b).

1. The function g ∶ [−0.5π, 0.5π] → R defined by x ↦ sin x has range [−1, 1]. So the inverse function has domain [−1, 1].

2. The domain of g is [−0.5π, 0.5π]. So the inverse function has codomain [−0.5π, 0.5π].

3. Pick any element y in the range and write:

$$y = g(x) \iff y = \sin x \iff \underbrace{\sin^{-1} y}_{g^{-1}(y)} = x.$$

So g⁻¹ has mapping rule y ↦ sin⁻¹ y. (For a brief review of the arcsine function, see Section 26.6 and the sections that follow.)

(c).

1. The function h ∶ R → R defined by x ↦ x³ has range R. So the inverse function has domain R.

2. The domain of h is R. So the inverse function has codomain R.

3. Pick any element y in the range and write:

$$y = h(x) \iff y = x^3 \iff \underbrace{\sqrt[3]{y}}_{h^{-1}(y)} = x.$$

So h⁻¹ has mapping rule y ↦ ∛y.

Answer to Exercise 45. Pick y = 1 in the range. It corresponds to two elements in the domain, namely 2 and 0. And so this function is not one-to-one.

Now restrict the domain of the function f to create the function g ∶ (1, ∞) → R defined by x ↦ 1/(x − 1)².

1. The function g has range (0, ∞). So the inverse function has domain (0, ∞).

2. The domain of g is (1, ∞). So the inverse function has codomain (1, ∞).

3. Pick any element y in the range and write:

$$y = g(x) \iff y = \frac{1}{(x-1)^2} \iff (x-1)^2 = \frac{1}{y} \ (\because y \neq 0) \iff x - 1 = \pm\sqrt{\frac{1}{y}} \iff x = 1 \pm \sqrt{\frac{1}{y}}.$$

We know that the domain of g, and hence the codomain of g⁻¹, is (1, ∞). So g⁻¹ has mapping rule y ↦ 1 + √(1/y).
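A quick numerical check of this inverse, written as a Python sketch: composing the restricted function with the claimed inverse should return the starting point for any x in (1, ∞). The sample points are arbitrary choices of mine.

```python
from math import sqrt


def g(x: float) -> float:
    """The restricted function g: (1, infinity) -> R, x -> 1 / (x - 1)^2."""
    return 1 / (x - 1) ** 2


def g_inverse(y: float) -> float:
    """The claimed inverse, with the positive root chosen so the output is > 1."""
    return 1 + sqrt(1 / y)


for x in (1.5, 2.0, 3.0, 10.0):
    # The round trip g_inverse(g(x)) should reproduce x up to rounding.
    print(x, g_inverse(g(x)))
```

Choosing the negative root instead would give outputs below 1, which lie outside the codomain (1, ∞) of the inverse.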

Answer to Exercise 46. Given the function f ∶ R → R defined by f ∶ x ↦ x2 , we restrict its domain to [20, 30] to obtain a brand new function g ∶ [20, 30] → R defined by x ↦ x2 .

1. The function g has range [400, 900]. So the inverse function has domain [400, 900]. 2. The domain of g is [20, 30]. So the inverse function has codomain [20, 30]. 3. Pick any element y in the range and write:

y = g(x) ⇐⇒ y = x² ⇐⇒ ±√y = x.

We know that the domain of g, and hence the codomain of g⁻¹, is [20, 30]. So g⁻¹ has mapping rule y ↦ √y.

Answer to Exercise 47. (a) The range of g is [1, ∞) and this is indeed a subset of the domain of f (which is R). So the composite function fg ∶ R → R exists and is defined by $x \mapsto f(g(x)) = f(x^2+1) = e^{x^2+1}$. Thus, $fg(1) = e^{1^2+1} = e^2$ and $fg(2) = e^{2^2+1} = e^5$.

(b) The range of g is R+ and this is indeed a subset of the domain of f (which is R). So the composite function fg ∶ R → R exists and is defined by $x \mapsto f(g(x)) = f(e^x) = (e^x)^2 + 1 = e^{2x} + 1$. Thus, $fg(1) = e^{2(1)} + 1 = e^2 + 1$ and $fg(2) = e^{2(2)} + 1 = e^4 + 1$.

(c) The range of g is R− ∪ R+ and this is indeed a subset of the domain of f (which is R− ∪ R+). So the composite function fg ∶ R− ∪ R+ → R− ∪ R+ exists and is defined by $x \mapsto f(g(x)) = f\left(\frac{1}{2x}\right) = 1 \Big/ \left(\frac{1}{2x}\right) = 2x$. Thus, fg(1) = 2(1) = 2 and fg(2) = 2(2) = 4.

(d) The range of g is R− ∪ R+ and this is indeed a subset of the domain of f (which is R− ∪ R+). So the composite function fg ∶ R− ∪ R+ → R− ∪ R+ exists and is defined by $x \mapsto f(g(x)) = f\left(\frac{1}{x}\right) = \frac{1}{2 \times 1/x} = \frac{x}{2}$. Thus, fg(1) = 1/2 and fg(2) = 2/2 = 1.

Answer to Exercise 48. (a) The range of f is R+ and this is indeed a subset of the domain of f (which is R). So the composite function f² ∶ R → R exists and is defined by $x \mapsto f(f(x)) = e^{f(x)} = e^{e^x}$. Hence, $f^2(1) = e^{e^1} = e^e$ and $f^2(2) = e^{e^2}$.

(b) The range of f is R and this is indeed a subset of the domain of f (which is R). So the composite function f² ∶ R → R exists and is defined by x ↦ f(f(x)) = 3f(x) + 2 = 3(3x + 2) + 2 = 9x + 8. Hence, f²(1) = 17 and f²(2) = 26.

(c) The range of f is [1, ∞) and this is indeed a subset of the domain of f (which is R). So the composite function f² ∶ R → R exists and is defined by

$$x \mapsto f(f(x)) = 2[f(x)]^2 + 1 = 2(2x^2+1)^2 + 1 = 2(4x^4 + 4x^2 + 1) + 1 = 8x^4 + 8x^2 + 3.$$

Hence, f²(1) = 8 + 8 + 3 = 19 and f²(2) = 8(16) + 8(4) + 3 = 163.

86.4 Answers to Exercises in Ch. 4. Graphs

Answer to Exercise 49. (a) No, it is impossible to rewrite the equation x² + y² = 1 into the form of a single function. For every value of x, there can be two corresponding values of y. For example, if x = 0, then either y = −1 or y = 1 will satisfy the equation. There is thus no way to write y as a function of x. Conversely, for every value of y, there can likewise be two corresponding values of x. There is thus no way to write x as a function of y.

(b) Although it is impossible to rewrite the equation x² + y² = 1 into the form of a single function, it is nonetheless possible to rewrite it into the form of two functions. Namely, f ∶ [−1, 1] → R defined by x ↦ √(1 − x²) and g ∶ [−1, 1] → R defined by x ↦ −√(1 − x²). These are depicted in Example 73.

Answer to Exercise 50 (a). The graph of the equation y = eˣ:

[Figure: graph of y = eˣ for −2 ≤ x ≤ 2.]

Answer to Exercise 50 (b). The graph of the equation y = 3x + 2:

[Figure: graph of y = 3x + 2 for −2 ≤ x ≤ 2.]

Answer to Exercise 50 (c). The graph of the equation y = 2x² + 1:

[Figure: graph of y = 2x² + 1 for −2 ≤ x ≤ 2.]

86.5 Answers to Exercises in Ch. 5. Quick Revision

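Answer to Exercise 51.

$$\begin{aligned}
\frac{5^{3x}\cdot 25^{1-x}}{5^{2x+1} + 3(25^x) + 17(5^{2x})}
&= \frac{5^{3x}\cdot 5^{2(1-x)}}{5^{2x+1} + 3(25^x) + 17(5^{2x})} && \because 25 = 5^2 \\
&= \frac{5^{2+x}}{5^{2x+1} + 3(25^x) + 17(5^{2x})} && \text{Add the exponents} \\
&= \frac{5^{2+x}}{5^{2x+1} + 3(5^{2x}) + 17(5^{2x})} && \because 25 = 5^2 \\
&= \frac{5^{2+x}}{5^{2x}(5^1 + 3 + 17)} && \text{Factorise out } 5^{2x} \\
&= \frac{5^{2+x}}{5^{2x}(25)} = \frac{5^x}{5^{2x}} = \frac{1}{5^x} = 5^{-x}.
\end{aligned}$$

$$\begin{aligned}
\frac{8^{x+2} - 34(2^{3x})}{(\sqrt{8})^{2x+1}}
&= \frac{8^{x+2} - 34(2^{3x})}{(\sqrt{8})^{2x}(\sqrt{8})^{1}} && \text{Splitting out the exponents} \\
&= \frac{8^{x+2} - 34(2^{3x})}{(8^x)(\sqrt{8})} = \frac{8^{x+2} - 34(8^x)}{(8^x)(\sqrt{8})} \\
&= \frac{(8^x)(8^2 - 34)}{(8^x)(\sqrt{8})} && \text{Factorise out the } 8^x \\
&= \frac{8^2 - 34}{\sqrt{8}} = \frac{64 - 34}{\sqrt{8}} = \frac{30}{\sqrt{8}} = \frac{30}{2\sqrt{2}} = \frac{15}{\sqrt{2}}.
\end{aligned}$$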

Answer to Exercise 52. (i) $x^{(a^b)} = x^{ab}$ is false. Here’s a counter-example. Let x = 2, a = 3, b = 4. Then $x^{(a^b)} = 2^{(3^4)} = 2^{81}$, but $x^{ab} = 2^{3\times 4} = 2^{12}$. The two are clearly not equal.

(ii) $(x^a)^b = x^{ab}$ is true, as we now prove:

$$(x^a)^b = \underbrace{(x^a)\cdot(x^a)\cdots(x^a)}_{b \text{ times}} = \underbrace{(\underbrace{x\cdot x\cdots x}_{a \text{ times}})\cdot(\underbrace{x\cdot x\cdots x}_{a \text{ times}})\cdots(\underbrace{x\cdot x\cdots x}_{a \text{ times}})}_{b \text{ times}} = x^{ab}.$$

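Answer to Exercise 53.

$$\begin{aligned}
\frac{1}{\frac{x}{y} + \sqrt{\frac{x^2}{y^2}+1}}
&= \frac{\frac{x}{y} - \sqrt{\frac{x^2}{y^2}+1}}{\left(\frac{x}{y} + \sqrt{\frac{x^2}{y^2}+1}\right)\left(\frac{x}{y} - \sqrt{\frac{x^2}{y^2}+1}\right)} \\
&= \frac{\frac{x}{y} - \sqrt{\frac{x^2}{y^2}+1}}{\left(\frac{x}{y}\right)^2 - \left(\sqrt{\frac{x^2}{y^2}+1}\right)^2} \\
&= \frac{\frac{x}{y} - \sqrt{\frac{x^2}{y^2}+1}}{\frac{x^2}{y^2} - \left(\frac{x^2}{y^2}+1\right)} \\
&= \frac{\frac{x}{y} - \sqrt{\frac{x^2}{y^2}+1}}{-1} \\
&= \sqrt{\frac{x^2}{y^2}+1} - \frac{x}{y}.
\end{aligned}$$

86.6 Answers to Exercises in Ch. 6. Intercepts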

Answer to Exercise 54. (a) The graph of the equation x2 +y 2 = 1 intersects the horizontal axis at the points (−1, 0) and (1, 0); and intersects the vertical axis at the points (0, −1) and (0, 1).

(b) The graph of the equation y = x2 − 4 intersects the horizontal axis at the points (−2, 0) and (2, 0); and intersects the vertical axis at the point (0, −4).

(c) The graph of the equation y = x² + 2x + 1 = (x + 1)² intersects the horizontal axis at the point (−1, 0) and the vertical axis at the point (0, 1).

(d) The graph of the equation y = x² + 2x + 2 does not intersect the horizontal axis, but does intersect the vertical axis at the point (0, 2).

86.7 Answers to Exercises in Ch. 7. Symmetry

Answer to Exercise 55. (a) Given the point (3, 17), its reflection in the line y = x is (17, 3) and its reflection in the line y = −x is (−17, −3).

(b) Given the point (−1, 5), its reflection in the line y = x is (5, −1) and its reflection in the line y = −x is (−5, 1). (c) Given the point (0, 0), its reflection in the line y = x is (0, 0) and its reflection in the line y = −x is (0, 0).

86.8 Answers to Exercises in Ch. 8. Limits, Continuity, and Asymptotes

Answer to Exercise 56. Given f ∶ R → R defined by

$$f(x) = \begin{cases} 1, & \text{if } x \le 0, \\ 2, & \text{if } x > 0, \end{cases}$$

we have

$$\lim_{x\to -5} f(x) = 1, \qquad \lim_{x\to 0} f(x) \text{ is undefined}, \qquad \text{and} \qquad \lim_{x\to 5} f(x) = 2.$$

86.9 Answers to Exercises in Ch. 9. Differentiation

Answer to Exercise 57. Let B be the set of points at which f′ is differentiable. Then f″ is the function with domain B, codomain R, and mapping rule x ↦ f″(x).

Answer to Exercise 58. Given g ∶ R → R defined by x ↦ x⁴ − x³ + x² − x + 1, the derivative of g is the function with domain and codomain both R and mapping rule x ↦ 4x³ − 3x² + 2x − 1. It may be denoted $g'$ or $\frac{dg}{dx}$ or $\dot{g}$. Evaluated at 1, we have $g'(1) = \left.\frac{dg}{dx}\right|_{x=1} = \dot{g}(1) = 2$.

The 2nd derivative of g is the function with domain and codomain both R and mapping rule x ↦ 12x² − 6x + 2. It may be denoted $g''$ or $\frac{d^2g}{dx^2}$ or $\ddot{g}$. Evaluated at 1, we have $g''(1) = \left.\frac{d^2g}{dx^2}\right|_{x=1} = \ddot{g}(1) = 8$.

The 3rd derivative of g is the function with domain and codomain both R and mapping rule x ↦ 24x − 6. It may be denoted $g^{(3)}$ or $\frac{d^3g}{dx^3}$ or $\overset{3}{\dot{g}}$. Evaluated at 1, we have $g^{(3)}(1) = \left.\frac{d^3g}{dx^3}\right|_{x=1} = \overset{3}{\dot{g}}(1) = 18$.

The 4th derivative of g is the function with domain and codomain both R and mapping rule x ↦ 24. It may be denoted $g^{(4)}$ or $\frac{d^4g}{dx^4}$ or $\overset{4}{\dot{g}}$. Evaluated at 1, we have $g^{(4)}(1) = \left.\frac{d^4g}{dx^4}\right|_{x=1} = \overset{4}{\dot{g}}(1) = 24$.

For n ≥ 5, the nth derivative of g is the function with domain and codomain both R and mapping rule x ↦ 0. It may be denoted $g^{(n)}$ or $\frac{d^ng}{dx^n}$ or $\overset{n}{\dot{g}}$. Evaluated at 1, we have $g^{(n)}(1) = \left.\frac{d^ng}{dx^n}\right|_{x=1} = \overset{n}{\dot{g}}(1) = 0$.
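The repeated differentiation above can be mimicked with a short Python sketch that stores a polynomial as a list of coefficients (my own representation, chosen for illustration) and applies the power rule term by term.

```python
def derivative(coeffs):
    """Differentiate a polynomial; coeffs[i] is the coefficient of x**i."""
    # Power rule: c * x**i becomes (i * c) * x**(i - 1); the constant term drops out.
    return [i * c for i, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's method."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


g = [1, -1, 1, -1, 1]  # 1 - x + x^2 - x^3 + x^4

values = []
current = g
for n in range(1, 6):
    current = derivative(current)
    values.append(evaluate(current, 1))  # nth derivative evaluated at 1

print(values)
```

Once the coefficient list becomes empty, the polynomial is identically zero, which matches the statement that every derivative of order 5 or higher is the zero function.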

Answer to Exercise 59. (a) f′(x) = 2x, so f′(0) = 0.

(b) $g'(x) = 2\left[x - \ln(x+1)\right]\left(1 - \dfrac{1}{x+1}\right)$. So g′(0) = 0.

(c) Observe that $h(x) = \sin\dfrac{x}{g(x)}$. So using the chain rule and then the quotient rule,

$$h'(x) = \left[\cos\frac{x}{g(x)}\right]\cdot\frac{d}{dx}\left[\frac{x}{g(x)}\right] = \left[\cos\frac{x}{g(x)}\right]\cdot\frac{g(x) - xg'(x)}{[g(x)]^2}.$$

Since g(0) = 1 and g′(0) = 0, we have

$$h'(0) = \left[\cos\frac{0}{g(0)}\right]\cdot\frac{g(0) - 0\,g'(0)}{[g(0)]^2} = 1\cdot\frac{1 - 0}{1^2} = 1.$$

Answer to Exercise 60.

$$\frac{d}{dx}\cot x = \frac{d}{dx}\frac{\cos x}{\sin x} = \frac{\sin x(-\sin x) - \cos x\cos x}{\sin^2 x} = \frac{-\sin^2 x - \cos^2 x}{\sin^2 x} = \frac{-1}{\sin^2 x} = -\csc^2 x.$$

$$\frac{d}{dx}\csc x = \frac{d}{dx}\frac{1}{\sin x} = \frac{0 - \cos x}{\sin^2 x} = -\frac{1}{\sin x}\,\frac{\cos x}{\sin x} = -\csc x\cot x.$$

Answer to Exercise 61. (a) Newton’s Second Law of Motion is $F = \dfrac{d}{dt}(mv)$. (In words, force is equal to the rate of change of momentum.)

(b) Using the Product Rule,

$$F = \frac{d}{dt}(mv) = \frac{dm}{dt}v + m\frac{dv}{dt}.$$

Now, under the assumption that mass is constant, we have $\dfrac{dm}{dt} = 0$, so that $F = m\dfrac{dv}{dt}$. Acceleration (a) is defined as the rate of change of velocity. Hence, F = ma.

Answer to Exercise 62.

$$\frac{d}{dx}\sec x = \frac{d}{dx}\frac{1}{\cos x} = \frac{0 - (-\sin x)}{\cos^2 x} = \frac{1}{\cos x}\,\frac{\sin x}{\cos x} = \sec x\tan x.$$

Rewrite y = cos⁻¹ x as x = cos y and then apply $\frac{d}{dx}$ (implicit differentiation) to get $1 = -\sin y\,\dfrac{dy}{dx}$. But sin² y + cos² y = 1, so $\sin y = \sqrt{1 - x^2}$ (taking the non-negative root, since y = cos⁻¹ x ∈ [0, π]). Thus,

$$\frac{dy}{dx} = \frac{d}{dx}\cos^{-1}x = \frac{-1}{\sin y} = \frac{-1}{\sqrt{1 - x^2}}.$$

Rewrite y = tan⁻¹ x as x = tan y and then apply $\frac{d}{dx}$ (implicit differentiation) to get $1 = \sec^2 y\,\dfrac{dy}{dx}$. But 1 + tan² y = sec² y, so

$$\frac{dy}{dx} = \frac{d}{dx}\tan^{-1}x = \frac{1}{\sec^2 y} = \frac{1}{1 + x^2}.$$
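The derivative formulas for cot, csc, sec, cos⁻¹ and tan⁻¹ can be spot-checked numerically. The Python sketch below compares each closed-form derivative against a central-difference approximation at a single, arbitrarily chosen point x0 = 0.4. This is only a sanity check, not a proof, and the helper name `central_difference` is my own.

```python
from math import acos, atan, cos, sin, sqrt


def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Each entry pairs a function with the closed-form derivative claimed for it.
checks = {
    "cot": (lambda x: cos(x) / sin(x), lambda x: -1 / sin(x) ** 2),
    "csc": (lambda x: 1 / sin(x), lambda x: -cos(x) / sin(x) ** 2),
    "sec": (lambda x: 1 / cos(x), lambda x: sin(x) / cos(x) ** 2),
    "arccos": (acos, lambda x: -1 / sqrt(1 - x * x)),
    "arctan": (atan, lambda x: 1 / (1 + x * x)),
}

x0 = 0.4
errors = {
    name: abs(central_difference(f, x0) - df(x0))
    for name, (f, df) in checks.items()
}
print(errors)
```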

86.10 Answers to Exercises in Ch. 11. Stationary, Maximum, Minimum, and Inflexion Points

Answer to Exercise 63. For every k ∈ Z, g is increasing on $\left[-\frac{\pi}{2} + 2k\pi, \frac{\pi}{2} + 2k\pi\right]$, decreasing on $\left[\frac{\pi}{2} + 2k\pi, \frac{3\pi}{2} + 2k\pi\right]$, strictly increasing on $\left(-\frac{\pi}{2} + 2k\pi, \frac{\pi}{2} + 2k\pi\right)$, and strictly decreasing on $\left(\frac{\pi}{2} + 2k\pi, \frac{3\pi}{2} + 2k\pi\right)$.

Answer to Exercise 64.

(a) Given the function f ∶ R → R defined by x ↦ 100 ...

(i) Every point a ∈ R is a maximum point with corresponding maximum value f(a) = 100;

(ii) Every point a ∈ R is a minimum point with corresponding minimum value f(a) = 100;

(iii) No point is a strict maximum;

(iv) No point is a strict minimum;

(v) Every point a ∈ R is a global maximum point with corresponding global maximum value f(a) = 100;

(vi) Every point a ∈ R is a global minimum point with corresponding global minimum value f(a) = 100;

(vii) No point is a strict global maximum;

(viii) No point is a strict global minimum.

(b) Given the function g ∶ R → R defined by x ↦ x² ...

(i) No point is a maximum;

(ii) Only x = 0 is a minimum point with corresponding minimum value g(0) = 0;

(iii) No point is a strict maximum;

(iv) Only x = 0 is a strict minimum point with corresponding strict minimum value g(0) = 0;

(v) No point is a global maximum;

(vi) Only x = 0 is a global minimum point with corresponding global minimum value g(0) = 0;

(vii) No point is a strict global maximum;

(viii) Only x = 0 is a strict global minimum point with corresponding strict global minimum value g(0) = 0.

Answer to Exercise 64. (c) Given the function h ∶ [1, 2] → R defined by x ↦ x2 ... (i) Only x = 2 is a maximum with corresponding maximum value h(2) = 4;

(ii) Only x = 1 is a minimum point with corresponding minimum value h(1) = 1;

(iii) Only x = 2 is a strict maximum point with corresponding strict maximum value h(2) = 4;

(iv) Only x = 1 is a strict minimum point with corresponding strict minimum value h(1) = 1;

(v) Only x = 2 is a global maximum with corresponding global maximum value h(2) = 4;

(vi) Only x = 1 is a global minimum point with corresponding global minimum value h(1) = 1;

(vii) Only x = 2 is a strict global maximum point with corresponding strict global maximum value h(2) = 4;

(viii) Only x = 1 is a strict global minimum point with corresponding strict global minimum value h(1) = 1. Answer to Exercise 65. (a) It is false that every maximum point or minimum point is a stationary point — see Points A and E in Example 141. (b) It is also false that every maximum point or minimum point is a turning point — again, see Points A and E in Example 141. (c) It is false that every stationary point is a maximum point or minimum point — see Point D in Example 141. (d) By Definition 140, it is true that every turning point is a maximum point or minimum point. (e) By Definition 140, it is true that every turning point is a stationary point.

(f) It is false that every stationary point is a turning point. Again, see Point D in Example 141.

Answer to Exercise 66. In order for −1 to be a minimum point of g, it must be that to its left, g is decreasing; while to its right, g is increasing. In other words, g′(x) ≤ 0 to the left of −1, while g′(x) ≥ 0 to the right of −1. Altogether then, we must have g′(−1) = 0: at the minimum point, the slope of the function must be 0.

Answer to Exercise 67. “If c is a maximum or minimum point AND in the interior of D, then c is a turning point” is true! By the IET, c is a stationary point. Since c is also either a maximum or a minimum point, by Definition 44, c is also a turning point.

Answer to Exercise 68. (a) f ∶ R → R defined by x ↦ x.

1. Identify all the stationary points (i.e. x where f ′ (x) = 0). f ′ (x) = 1, for all x. So f has no stationary points. 2. Identify all the non-interior points.

There are no non-interior points because every point x ∈ R is in the interior of R.

3. Check if each of these points is a maximum point, a minimum point, or neither. There are neither stationary nor non-interior points. Hence, there are no maximum or minimum points. (b) g ∶ [0, 1] → R defined by x ↦ x.

1. Identify all the stationary points (i.e. x where g′(x) = 0). g′(x) = 1, for all x. So g has no stationary points.

2. Identify all the non-interior points.

The only two non-interior points are 0 and 1. 3. Check if each of these points is a maximum point, a minimum point, or neither. 0 is the only minimum point and 1 is the only maximum point of g.


Answer to Exercise 68. (c) h ∶ R → R defined by x ↦ x4 − 2x2 .

1. Identify all the stationary points (i.e. x where h′ (x) = 0).

h′ (x) = 4x3 − 4x = 4x (x2 − 1) = 4x(x − 1)(x + 1). So the stationary points of h are 0, 1, and −1. 2. Identify all the non-interior points.

There are no non-interior points because every point x ∈ R is in the interior of R.

3. Check if each of these points is a maximum point, a minimum point, or neither. From a sketch of the graph, we see that 0 is a maximum point. And ±1 are minimum points (and also global minimum points).

[Figure: graph of y = x⁴ − 2x² for −2 ≤ x ≤ 2.]
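As an alternative to reading the answer off a sketch, the three stationary points of h(x) = x⁴ − 2x² can be classified using the sign of the second derivative. The Python sketch below does this; the function names are my own, and the second derivative test is a substitute technique here, not the graphical argument used in the answer.

```python
def h(x):
    return x**4 - 2 * x**2


def h_prime(x):
    return 4 * x**3 - 4 * x


def h_double_prime(x):
    return 12 * x**2 - 4


def classify(x):
    """Second derivative test at a stationary point x."""
    second = h_double_prime(x)
    if second > 0:
        return "minimum"
    if second < 0:
        return "maximum"
    return "inconclusive"


results = {x: classify(x) for x in (-1, 0, 1)}
print(results)
print(h(-1), h(0), h(1))  # values of h at the stationary points
```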

Answer to Exercise 69. (a) $g: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto x^8 + x^7 - x^6$.

1. Identify all the stationary points.

$$g'(x) = 8x^7 + 7x^6 - 6x^5 = x^5(8x^2 + 7x - 6) = 0 \iff x = 0 \text{ or } x = \frac{-7 \pm \sqrt{7^2 - 4(8)(-6)}}{2(8)} = \frac{-7 \pm \sqrt{241}}{16}.$$

(a) $g''(x) = 56x^6 + 42x^5 - 30x^4$. So $g''(0) = 0$, $g''\left(\frac{-7 - \sqrt{241}}{16}\right) > 0$, and $g''\left(\frac{-7 + \sqrt{241}}{16}\right) > 0$.

(b) So $\frac{-7 \pm \sqrt{241}}{16}$ are both minimum points. The 2DT is inconclusive about $0$. However, $g(x) = x^6(x^2 + x - 1)$ and $x^2 + x - 1 < 0$ for $x$ near $0$, so $g(x) \le 0 = g(0)$ for $x$ near $0$. Hence, $0$ is a maximum point.

2. There are no non-interior points.

Altogether, we conclude that the only maximum point is $0$ and the only two minimum points are $\frac{-7 \pm \sqrt{241}}{16}$.

(b) $h: \left(-\frac{\pi}{2}, \frac{\pi}{2}\right) \to \mathbb{R}$ defined by $x \mapsto \tan x$.

1. Identify all the stationary points. $h'(x) = \sec^2 x$ is never equal to $0$, so there are no stationary points.

2. There are no non-interior points.

Altogether, we conclude that there are no maximum points and no minimum points.

(c) $i: [0, 2\pi] \to \mathbb{R}$ defined by $x \mapsto \sin x + \cos x$.

1. Identify all the stationary points. $i'(x) = \cos x - \sin x = 0 \iff x = \frac{\pi}{4}, \frac{5\pi}{4}$.

(a) $i''(x) = -\sin x - \cos x$. So $i''\left(\frac{\pi}{4}\right) < 0$ and $i''\left(\frac{5\pi}{4}\right) > 0$.

(b) So $\frac{\pi}{4}$ is a maximum point and $\frac{5\pi}{4}$ is a minimum point.

2. The only two non-interior points are $0$ and $2\pi$. The former is a minimum point and the latter is a maximum point.

Altogether, we conclude that the two maximum points are $\frac{\pi}{4}$ and $2\pi$, and the two minimum points are $\frac{5\pi}{4}$ and $0$.

Answer to Exercise 70 (a). (i) The graph below is of the equation $y = 2e^x + x$.

(ii) The graph intersects the vertical and horizontal axes at $(0, 2)$ and $(-0.8526, 0)$ (calculator).

(iii) $\frac{dy}{dx} = 2e^x + 1$ is never equal to zero. Hence, there are no stationary points (and thus no turning points either).

(iv) By observation, there are no lines of symmetry.

(v) Observe that as $x \to -\infty$, $y - x = 2e^x \to 0$. And so there is an oblique asymptote $y = x$.

[Figure: graph of $y = 2e^x + x$, with its oblique asymptote $y = x$.]

Answer to Exercise 70 (b). (i) The graph below is of the equation $y = 3x + 2$.

(ii) The point at which the graph intersects the vertical axis is $(0, 2)$ and the point at which it intersects the horizontal axis is $\left(-\frac{2}{3}, 0\right)$.

(iii) $\frac{dy}{dx} = 3$ is never equal to zero. Hence, there are no stationary points (and thus no turning points either).

(iv) By observation, the graph has infinitely many lines of symmetry of the form $y = -\frac{x}{3} + k$ (for any $k \in \mathbb{R}$). For example, $y = 2 - \frac{x}{3}$ (illustrated below) is a line of symmetry.

Another line of symmetry is trivial: the line $y = 3x + 2$ is its own line of symmetry. Of course, every line is symmetric in itself.

(v) By observation, there are no asymptotes.

[Figure: graph of $y = 3x + 2$, with $y = 2 - \frac{x}{3}$, one of its infinitely many lines of symmetry.]

Answer to Exercise 70 (c). (i) The graph below is of the equation $y = 2x^2 + 1$.

(ii) The point at which the graph intersects the vertical axis is $(0, 1)$ and there is no point at which it intersects the horizontal axis.

(iii) $\frac{dy}{dx} = 4x$ is equal to zero if and only if $x = 0$. Hence, this is also the only stationary point. By observation (or by the second-derivative test), this is a minimum turning point.

(iv) By observation, the graph is symmetric in the line $x = 0$ (which is also the vertical axis).

(v) By observation, there are no asymptotes.

[Figure: graph of $y = 2x^2 + 1$.]

86.11

Answers to Exercises in Ch. 14. Quadratic Equations

Answer to Exercise 71. The graphs of all three equations are below: (a) $y = 2x^2 + x + 1$ (red). (b) $y = -2x^2 + x + 1$ (blue). (c) $y = x^2 + 6x + 9$ (green).

(a) Since $b^2 - 4ac = 1^2 - 4(2)(1) = 1 - 8 = -7 < 0$, there are no horizontal intercepts. The vertical intercept is $c = 1$. The turning point is at $x = -\frac{b}{2a} = -\frac{1}{4} = -0.25$.

(b) Since $b^2 - 4ac = 1^2 - 4(-2)(1) = 1 + 8 = 9 > 0$, there are two horizontal intercepts, namely $\frac{-b \pm \sqrt{9}}{2a} = \frac{-1 \pm \sqrt{9}}{-4} = 1, -0.5$. The vertical intercept is $c = 1$. The turning point is at $x = -\frac{b}{2a} = \frac{-1}{-4} = 0.25$.

(c) Since $b^2 - 4ac = 6^2 - 4(1)(9) = 36 - 36 = 0$, there is one horizontal intercept, namely $-\frac{b}{2a} = -\frac{6}{2} = -3$. The vertical intercept is $c = 9$. The turning point is at $x = -\frac{b}{2a} = -3$.
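The discriminant-based reasoning used for each quadratic can be mirrored in a few lines of code. This is a rough sketch under the assumption that $a \ne 0$. The helper name `quadratic_features` and its dictionary keys are made up for illustration.

```python
import math

def quadratic_features(a, b, c):
    """Summarise y = a*x**2 + b*x + c (assumes a != 0)."""
    disc = b * b - 4 * a * c
    if disc < 0:
        roots = []                      # no horizontal intercepts
    elif disc == 0:
        roots = [-b / (2 * a)]          # graph just touches the horizontal axis
    else:
        roots = sorted((-b + s * math.sqrt(disc)) / (2 * a) for s in (1, -1))
    return {
        "discriminant": disc,
        "horizontal_intercepts": roots,
        "vertical_intercept": c,
        "turning_point_x": -b / (2 * a),
    }

# Part (b): y = -2x^2 + x + 1
print(quadratic_features(-2, 1, 1))
```

For part (b) this reports a discriminant of 9, horizontal intercepts at $-0.5$ and $1$, and a turning point at $x = 0.25$.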


86.12

Answers to Exercises in Ch. 15. Transformations

Answer to Exercise 72. Below, alongside the graph of $f$ (in red), are the graphs of $y = |2f(3x)|$ (in green), $y = f(|x - 1|)$ (in blue), and $y^2 = f(x) + 4$ (in purple).

(a) To get the graph of $y = |2f(3x)|$ (in green), stretch the red graph horizontally (outwards from the vertical axis) by a factor of $\frac{1}{3}$, then stretch the new graph vertically (upwards from the horizontal axis) by a factor of 2, and finally reflect all points for which $y < 0$ on the horizontal axis.

(b) To get the graph of $y = f(|x - 1|)$ (in blue), shift the red graph rightwards by 1 unit, then discard the part of the new graph for which $x < 1$ and replace it with the reflection, on the vertical line $x = 1$, of the part for which $x > 1$.

(c) To get the graph of $y^2 = f(x) + 4$ (in purple), shift the red graph upwards by 4 units, then for all points for which $f(x) + 4 \ge 0$, take both the positive and negative square roots.


Answer to Exercise 73. The series of transformations that would transform the graph of $y = \frac{1}{x}$ to $y = 3 - \frac{1}{5x - 2}$ is:

1. Stretch the graph horizontally, outwards from the vertical axis, by a scale factor of $\frac{1}{5}$ to get the graph of $y = \frac{1}{5x}$.
2. Move the graph rightward by $\frac{2}{5}$ units to get the graph of $y = \frac{1}{5\left(x - \frac{2}{5}\right)} = \frac{1}{5x - 2}$.
3. Reflect the graph on the horizontal axis to get the graph of $y = -\frac{1}{5x - 2}$.
4. Move the graph upward by 3 units to get the graph of $y = 3 - \frac{1}{5x - 2}$.


86.13

Answers to Exercises in Ch. 16: Conic Sections

Answer to Exercise 74. The equation $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ is the special case of the general equation $Ax^2 + Bxy + Cy^2 + Dx + Ey + F \overset{1}{=} 0$ where $A = \frac{1}{a^2}$, $B = 0$, $C = \frac{1}{b^2}$, $D = 0$, $E = 0$, and $F = -1$. We have $B^2 - 4AC = 0^2 - 4 \cdot \frac{1}{a^2} \cdot \frac{1}{b^2} < 0$, so that this is indeed an ellipse.

The equation $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$ is the special case of the equation $\overset{1}{=}$ where $A = \frac{1}{a^2}$, $B = 0$, $C = -\frac{1}{b^2}$, $D = 0$, $E = 0$, and $F = -1$. We have $B^2 - 4AC = 0^2 - 4 \cdot \frac{1}{a^2} \cdot \frac{-1}{b^2} > 0$, so that this is indeed a hyperbola.

The equation $\frac{y^2}{b^2} - \frac{x^2}{a^2} = 1$ is the special case of the equation $\overset{1}{=}$ where $A = -\frac{1}{a^2}$, $B = 0$, $C = \frac{1}{b^2}$, $D = 0$, $E = 0$, and $F = -1$. We have $B^2 - 4AC = 0^2 - 4 \cdot \frac{-1}{a^2} \cdot \frac{1}{b^2} > 0$, so that this is indeed a hyperbola.

The equation $y = \frac{ax + b}{cx + d}$ can be rewritten as $cxy + dy = ax + b$ or $cxy - ax + dy - b = 0$, with the additional condition that $x \ne -\frac{d}{c}$ (otherwise the denominator is 0). The equation $y = \frac{ax + b}{cx + d}$ is thus the special case of the equation $\overset{1}{=}$ where $A = 0$, $B = c$, $C = 0$, $D = -a$, $E = d$, and $F = -b$, with the additional condition that $x \ne -\frac{d}{c}$. We have $B^2 - 4AC = c^2 - 4(0)(0) > 0$, so that this is indeed a hyperbola.

The equation $y = \frac{ax^2 + bx + c}{dx + e}$ can be rewritten as $dxy + ey = ax^2 + bx + c$ or $-ax^2 + dxy - bx + ey - c = 0$, with the additional condition that $x \ne -\frac{e}{d}$ (otherwise the denominator is 0). The equation $y = \frac{ax^2 + bx + c}{dx + e}$ is thus the special case of the equation $\overset{1}{=}$ where $A = -a$, $B = d$, $C = 0$, $D = -b$, $E = e$, and $F = -c$, with the additional condition that $x \ne -\frac{e}{d}$. We have $B^2 - 4AC = d^2 - 4(-a)(0) = d^2 > 0$, so that this is indeed a hyperbola.
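The sign test on $B^2 - 4AC$ is mechanical enough to sketch in code. The function name `classify_conic` is hypothetical, degenerate conics are ignored, and the "parabola" label for a zero discriminant is the standard convention rather than something stated in this answer. The sample values $a = 3$, $b = 2$, and $c = 5$ are arbitrary.

```python
from fractions import Fraction

def classify_conic(A, B, C):
    """Classify a (non-degenerate) conic by the sign of B^2 - 4AC."""
    disc = B * B - 4 * A * C
    if disc < 0:
        return "ellipse"
    if disc > 0:
        return "hyperbola"
    return "parabola"

a, b = 3, 2  # arbitrary sample values for the ellipse/hyperbola parameters
print(classify_conic(Fraction(1, a**2), 0, Fraction(1, b**2)))   # x^2/a^2 + y^2/b^2 = 1
print(classify_conic(Fraction(1, a**2), 0, Fraction(-1, b**2)))  # x^2/a^2 - y^2/b^2 = 1
print(classify_conic(0, 5, 0))  # y = (ax + b)/(cx + d) with c = 5, so B = c
```

Exact fractions are used so that the sign of the discriminant is not affected by floating-point rounding.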


Answer to Exercise 75. Observe that the equation $\frac{(x + c)^2}{a^2} + \frac{(y + d)^2}{b^2} = 1$ is simply the same as the equation for the ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$, but shifted leftwards by $c$ units and downwards by $d$ units. So it is the exact same ellipse with only one difference: Instead of being centred on the origin $(0, 0)$, it is centred on the point $(-c, -d)$.

(i) See graph below.

(ii) It intersects the vertical axis at (−c, −d + b) and (−c, −d − b) and the horizontal axis at (−c + a, −d) and (−c − a, −d).

(iii) There is one maximum turning point (−c, −d + b) and one minimum turning point (−c, −d − b). (iv) There are two lines of symmetry y = −d and x = −c. (v) There are no asymptotes.

[Figure: graph of the ellipse $\frac{(x + c)^2}{a^2} + \frac{(y + d)^2}{b^2} = 1$, with centre $(-c, -d)$, lines of symmetry $y = -d$ and $x = -c$, and the points $(-c, -d \pm b)$ and $(-c \pm a, -d)$ marked.]

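Answer to Exercise 76. (a) Given $\frac{16x + 3}{5x - 2}$, we have

$$\begin{array}{rl}
 & 3.2 \\
5x - 2 \;\big) & 16x + 3 \\
 & 16x - 6.4 \\
 & 9.4
\end{array}$$

The “quotient” is $3.2$ and the “remainder” is $9.4$. Hence,

$$\frac{16x + 3}{5x - 2} = 3.2 + \frac{9.4}{5x - 2}.$$

(b) Given $\frac{4x^2 - 3x + 1}{x + 5}$, we have

$$\begin{array}{rl}
 & 4x - 23 \\
x + 5 \;\big) & 4x^2 - 3x + 1 \\
 & 4x^2 + 20x \\
 & -23x + 1 \\
 & -23x - 115 \\
 & 116
\end{array}$$

The “quotient” is $4x - 23$ and the “remainder” is $116$. Hence,

$$\frac{4x^2 - 3x + 1}{x + 5} = 4x - 23 + \frac{116}{x + 5}.$$

(c) Given $\frac{x^2 + x + 3}{-x^2 - 2x + 1}$, we have

$$\begin{array}{rl}
 & -1 \\
-x^2 - 2x + 1 \;\big) & x^2 + x + 3 \\
 & x^2 + 2x - 1 \\
 & -x + 4
\end{array}$$

The “quotient” is $-1$ and the “remainder” is $-x + 4$. Hence,

$$\frac{x^2 + x + 3}{-x^2 - 2x + 1} = -1 + \frac{-x + 4}{-x^2 - 2x + 1}.$$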
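Long division of polynomials can be checked with a short routine. What follows is a minimal sketch, not a method from the textbook: `poly_divmod` is a hypothetical helper that takes coefficient lists (highest power first) and assumes the divisor's leading coefficient is non-zero.

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Divide polynomials given as coefficient lists, highest power first.

    Returns (quotient, remainder) as lists of Fractions.
    """
    den = [Fraction(c) for c in den]
    remainder = [Fraction(c) for c in num]
    quotient = []
    while len(remainder) >= len(den):
        factor = remainder[0] / den[0]
        quotient.append(factor)
        # Subtract factor * den, aligned at the leading term.
        for i, d in enumerate(den):
            remainder[i] -= factor * d
        remainder.pop(0)  # the leading term is now zero
    return quotient, remainder

# Part (c): (x^2 + x + 3) divided by (-x^2 - 2x + 1)
quotient, remainder = poly_divmod([1, 1, 3], [-1, -2, 1])
print(quotient, remainder)
```

For part (c) the quotient list holds the single coefficient $-1$ and the remainder list holds $-1$ and $4$, i.e. the remainder is $-x + 4$.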


Answer to Exercise 77 (a). Graphed below is the equation $y = \frac{3x + 2}{x + 2}$. By long division, $y = \frac{3x + 2}{x + 2} = 3 - \frac{4}{x + 2}$.

There are two distinct branches.

1. Intercepts. The graph intersects the vertical axis at the point $(0, 1)$ and the horizontal axis at the point $(-2/3, 0)$.

2. There are no turning points.

3. Asymptotes. As $x \to -2$, $y \to \pm\infty$. And so $x = -2$ is a vertical asymptote. As $x \to \pm\infty$, $y \to 3$. And so $y = 3$ is a horizontal asymptote. The two asymptotes are perpendicular and so this is a rectangular hyperbola.

4. The centre (point at which the two asymptotes intersect) is $(-2, 3)$.

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes. So they must have slope $1$ and $-1$. Moreover, both pass through the centre $(-2, 3)$. Altogether, we can work out that the lines of symmetry are $y = -x + 1$ and $y = x + 5$.

Answer to Exercise 77(b). Graphed below is the equation $y = \frac{x - 2}{-2x + 1}$. By long division, $y = \frac{x - 2}{-2x + 1} = -0.5 - \frac{1.5}{-2x + 1}$.

There are two distinct branches.

1. Intercepts. The graph intersects the vertical axis at the point $(0, -2)$ and the horizontal axis at the point $(2, 0)$.

2. There are no turning points.

3. Asymptotes. As $x \to 0.5$, $y \to \pm\infty$. And so $x = 0.5$ is a vertical asymptote. As $x \to \pm\infty$, $y \to -0.5$. And so $y = -0.5$ is a horizontal asymptote. The two asymptotes are perpendicular and so this is a rectangular hyperbola.

4. The centre (point at which the two asymptotes intersect) is $(0.5, -0.5)$.

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes. So they must have slope $1$ and $-1$. Moreover, both pass through the centre $(0.5, -0.5)$. Altogether, we can work out that the lines of symmetry are $y = -x$ and $y = x - 1$.

Answer to Exercise 77(c). Graphed below is the equation $y = \frac{-3x + 1}{2x + 3}$. By long division, $y = \frac{-3x + 1}{2x + 3} = -1.5 + \frac{5.5}{2x + 3}$.

[Figure: graph of $y = \frac{-3x + 1}{2x + 3}$, with the vertical asymptote $x = -1.5$, the horizontal asymptote $y = -1.5$, and the centre $(-1.5, -1.5)$.]

There are two distinct branches. 1. Intercepts. The graph intersects the vertical axis at the point (0, 1/3) and the horizontal axis at the point (1/3, 0). 2. There are no turning points.

3. Asymptotes. As x → −1.5, y → ±∞. And so x = −1.5 is a vertical asymptote. As x → ±∞, y → −1.5. And so y = −1.5 is a horizontal asymptote. The two asymptotes are perpendicular and so this is a rectangular hyperbola.

4. The centre (point at which the two asymptotes intersect) is (−1.5, −1.5). 5. We know that the two lines of symmetry bisect the angles formed by the asymptotes. So they must have slope 1 and −1. Moreover, both pass through the centre (−1.5, −1.5). Altogether, we can work out that the lines of symmetry are y = x and y = −x − 3.


Answer to Exercise 78 (a). Consider the equation $y = \frac{x^2 + 2x + 1}{x - 4}$. By long division, $\frac{x^2 + 2x + 1}{x - 4} = x + 6 + \frac{25}{x - 4}$.

[Figure: graph of $y = \frac{x^2 + 2x + 1}{x - 4}$, showing the vertical asymptote $x = 4$, the oblique asymptote $y = x + 6$, the centre $(4, 10)$, the maximum and minimum turning points, and the two lines of symmetry.]

Let’s summarise the graph’s characteristics. This is a hyperbola and so there are two distinct branches.

1. Intercepts. The graph intersects the vertical axis at the point $(0, -0.25)$ and the horizontal axis at the point $(-1, 0)$.

2. There are two turning points: $(-1, 0)$ is a maximum turning point and $(9, 20)$ is a minimum turning point.

These were found by computing $\frac{dy}{dx} = 1 - \frac{25}{(x - 4)^2}$. Setting this equal to zero, we see that there are two stationary points: $x = -1, 9$. The corresponding values of $y$ are $\frac{0}{-5} = 0$ and $\frac{100}{5} = 20$.

The second derivative is $\frac{d^2y}{dx^2} = \frac{50}{(x - 4)^3}$, which when evaluated at $x = -1$ is negative and when evaluated at $x = 9$ is positive. And so the first stationary point is a maximum turning point and the second is a minimum turning point.

By observation, $y$ can take on any value except those between these two turning points. The range of $y$ is thus $(-\infty, 0] \cup [20, \infty)$.

3. Asymptotes. As x → 4, y → ±∞. Hence, there is one vertical asymptote: x = 4. As x → ±∞, y → x+6. Hence, there is one oblique asymptote: y = x+6. The two asymptotes are not perpendicular and so this is not a rectangular hyperbola. 4. The centre (point at which the two asymptotes intersect) is (4, 10).

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes and pass through the centre. Again, you don’t need to know how to find their equations. Nonetheless, I’ll tell you that they are $y = (1 + \sqrt{2})x + 6 - 4\sqrt{2}$ and $y = (1 - \sqrt{2})x + 6 + 4\sqrt{2}$.

Answer to Exercise 78 (b). Consider the equation $y = \frac{-x^2 + x - 1}{x + 1}$. By long division, $\frac{-x^2 + x - 1}{x + 1} = -x + 2 - \frac{3}{x + 1}$.

[Figure: graph of $y = \frac{-x^2 + x - 1}{x + 1}$, showing the vertical asymptote $x = -1$, the oblique asymptote $y = -x + 2$, the centre $(-1, 3)$, the two turning points, and the two lines of symmetry.]

This is a hyperbola and so there are two distinct branches.

1. Intercepts. The graph intersects the vertical axis at the point $(0, -1)$, but not the horizontal axis because $-x^2 + x - 1$ has no real zeros.

2. There are two turning points: $(-1 - \sqrt{3}, 6.464)$ is a minimum turning point and $(-1 + \sqrt{3}, -0.464)$ is a maximum turning point.

These were found by computing $\frac{dy}{dx} = -1 + \frac{3}{(x + 1)^2}$. Setting this equal to zero, we see that there are two stationary points: $x = -1 \pm \sqrt{3}$.

The second derivative is $\frac{d^2y}{dx^2} = -\frac{6}{(x + 1)^3}$, which when evaluated at $x = -1 - \sqrt{3}$ is positive and when evaluated at $x = -1 + \sqrt{3}$ is negative. And so the first stationary point is a minimum turning point and the second is a maximum turning point.

By observation, $y$ can take on any value except those between these two turning points. The range of $y$ is thus $(-\infty, -0.464] \cup [6.464, \infty)$.

3. Asymptotes. As x → −1, y → ±∞. Hence, there is one vertical asymptote: x = −1. As x → ±∞, y → −x + 2. Hence, there is one oblique asymptote: y = −x + 2. The two asymptotes are not perpendicular and so this is not a rectangular hyperbola. 4. The centre (point at which the two asymptotes intersect) is (−1, 3).

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes and pass through the centre. Again, you don’t need to know how to find their equations. Nonetheless, I’ll tell you that they are $y = (-1 + \sqrt{2})x + 2 + \sqrt{2}$ and $y = (-1 - \sqrt{2})x + 2 - \sqrt{2}$. (Both pass through the centre $(-1, 3)$, as can be checked by substituting $x = -1$.)

Answer to Exercise 78 (c). Consider the equation $y = \frac{2x^2 - 2x - 1}{x + 4}$. By long division, $\frac{2x^2 - 2x - 1}{x + 4} = 2x - 10 + \frac{39}{x + 4}$.

[Figure: graph of $y = \frac{2x^2 - 2x - 1}{x + 4}$, showing the vertical asymptote $x = -4$, the oblique asymptote $y = 2x - 10$, the centre $(-4, -18)$, the two turning points, and the two lines of symmetry.]

This is a hyperbola and so there are two distinct branches.

1. Intercepts. The graph intersects the vertical axis at the point $(0, -0.25)$ and the horizontal axis at the points $(0.5(1 - \sqrt{3}), 0)$ and $(0.5(1 + \sqrt{3}), 0)$, because $0.5(1 \pm \sqrt{3})$ are the zeros of $2x^2 - 2x - 1$.

2. There are two turning points: $\left(-4 - \sqrt{39/2}, -35.664\right)$ is a maximum turning point and $\left(-4 + \sqrt{39/2}, -0.336\right)$ is a minimum turning point.

These were found by computing $\frac{dy}{dx} = 2 - \frac{39}{(x + 4)^2}$. Setting this equal to zero, we see that there are two stationary points: $x = -4 \pm \sqrt{39/2}$.

The second derivative is $\frac{d^2y}{dx^2} = \frac{78}{(x + 4)^3}$, which when evaluated at $x = -4 - \sqrt{39/2}$ is negative and when evaluated at $x = -4 + \sqrt{39/2}$ is positive. And so the first stationary point is a maximum turning point and the second is a minimum turning point.

By observation, $y$ can take on any value except those between these two turning points. The range of $y$ is thus $(-\infty, -35.664] \cup [-0.336, \infty)$.

3. Asymptotes. As x → −4, y → ±∞. Hence, there is one vertical asymptote: x = −4. As x → ±∞, y → 2x − 10. Hence, there is one oblique asymptote: y = 2x − 10. The two asymptotes are not perpendicular and so this is not a rectangular hyperbola. 4. The centre (point at which the two asymptotes intersect) is (−4, −18).

5. We know that the two lines of symmetry bisect the angles formed by the asymptotes and pass through the centre. Again, you don’t need to know how to find their equations. Nonetheless, I’ll tell you that they are $y = (2 + \sqrt{5})x - 10 + 4\sqrt{5}$ and $y = (2 - \sqrt{5})x - 10 - 4\sqrt{5}$.
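Each part of Exercise 78 reduces, after long division, to the form $y = px + q + \frac{r}{x - s}$. The stationary points then satisfy $(x - s)^2 = r/p$. The sketch below computes them along with the centre. The names `turning_points` and `centre` are hypothetical, and the code assumes $r/p > 0$ so that stationary points exist.

```python
import math

def turning_points(p, q, r, s):
    """Stationary points of y = p*x + q + r/(x - s), assuming r/p > 0."""
    def y(x):
        return p * x + q + r / (x - s)
    # dy/dx = p - r/(x - s)^2 = 0  =>  (x - s)^2 = r/p
    offset = math.sqrt(r / p)
    return [(x, y(x)) for x in (s - offset, s + offset)]

def centre(p, q, s):
    """Intersection of the vertical asymptote x = s and the oblique asymptote y = p*x + q."""
    return (s, p * s + q)

# Part (c): y = 2x - 10 + 39/(x + 4), i.e. p = 2, q = -10, r = 39, s = -4
for x, y in turning_points(2, -10, 39, -4):
    print(round(x, 3), round(y, 3))
print(centre(2, -10, -4))
```

For part (c) the two $y$-values round to $-35.664$ and $-0.336$, and the centre is $(-4, -18)$.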

86.14

Answers to Exercises in Ch. 17. Simple Parametric Equations

Answer to Exercise 79. (a) Note that $\sin 0 = 0$ and $\cos 0 = 1$. And so at time $t = 0$, the particle $P$ is at position $(1, 0)$ and in contrast, the particle $Q$ is at position $(0, 1)$. (b) The particle $P$ travels anti-clockwise and in contrast, the particle $Q$ travels clockwise.

Answer to Exercise 80. $x = a\cos t \implies \frac{dx}{dt} = -a\sin t$, $\frac{d^2x}{dt^2} = -a\cos t$. $y = b\sin t \implies \frac{dy}{dt} = b\cos t$, $\frac{d^2y}{dt^2} = -b\sin t$.

(a) At time $t = \frac{\pi}{4}$, the particle’s position is $\left(a\cos\frac{\pi}{4}, b\sin\frac{\pi}{4}\right) = \left(a\frac{\sqrt{2}}{2}, b\frac{\sqrt{2}}{2}\right)$.

In the $x$-direction, its velocity is $-a\sin\frac{\pi}{4} = -a\frac{\sqrt{2}}{2}$ and its acceleration is $-a\cos\frac{\pi}{4} = -a\frac{\sqrt{2}}{2}$. This means that it is moving leftwards at a velocity of $a\frac{\sqrt{2}}{2}$ m s$^{-1}$. Moreover, its acceleration leftwards is $a\frac{\sqrt{2}}{2}$ m s$^{-2}$.

In the $y$-direction, its velocity is $b\cos\frac{\pi}{4} = b\frac{\sqrt{2}}{2}$ and its acceleration is $-b\sin\frac{\pi}{4} = -b\frac{\sqrt{2}}{2}$. This means that it is moving upwards at a velocity of $b\frac{\sqrt{2}}{2}$ m s$^{-1}$. Moreover, it is decelerating at a rate of $b\frac{\sqrt{2}}{2}$ m s$^{-2}$.

(b) At time $t = \frac{\pi}{2}$, the particle’s position is $\left(a\cos\frac{\pi}{2}, b\sin\frac{\pi}{2}\right) = (0, b)$.

In the $x$-direction, its velocity is $-a\sin\frac{\pi}{2} = -a$ and its acceleration is $-a\cos\frac{\pi}{2} = 0$. This means that it is moving leftwards at a velocity of $a$ m s$^{-1}$. Moreover, its acceleration is $0$ m s$^{-2}$.

In the $y$-direction, its velocity is $b\cos\frac{\pi}{2} = 0$ and its acceleration is $-b\sin\frac{\pi}{2} = -b$. This means that it is moving upwards at a velocity of $0$ m s$^{-1}$. (Or equivalently, it is moving downwards at a velocity of $0$ m s$^{-1}$.) Moreover, it is accelerating in the downwards direction at a rate of $b$ m s$^{-2}$.

(c) At time $t = 2\pi$, the particle’s position is $(a\cos(2\pi), b\sin(2\pi)) = (a, 0)$.

In the $x$-direction, its velocity is $-a\sin(2\pi) = 0$ and its acceleration is $-a\cos(2\pi) = -a$. This means that it is moving leftwards at a velocity of $0$ m s$^{-1}$. Moreover, it is accelerating in the leftwards direction at a rate of $a$ m s$^{-2}$.

In the $y$-direction, its velocity is $b\cos(2\pi) = b$ and its acceleration is $-b\sin(2\pi) = 0$. This means that it is moving upwards at a velocity of $b$ m s$^{-1}$. Moreover, it is not accelerating in the $y$-direction.
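The position, velocity, and acceleration formulas used in Exercise 80 can be evaluated numerically for any sample values. This is a small sketch only: the function name `ellipse_motion` and the choice $a = 2$, $b = 1$ are made up for illustration.

```python
import math

def ellipse_motion(a, b, t):
    """Position, velocity and acceleration of x = a cos t, y = b sin t at time t."""
    position = (a * math.cos(t), b * math.sin(t))
    velocity = (-a * math.sin(t), b * math.cos(t))
    acceleration = (-a * math.cos(t), -b * math.sin(t))
    return position, velocity, acceleration

# Sample values a = 2, b = 1 at t = pi/2 (compare with part (b))
position, velocity, acceleration = ellipse_motion(2, 1, math.pi / 2)
print(position, velocity, acceleration)
```

Because of floating-point rounding, quantities that are exactly $0$ in the algebra (such as $a\cos\frac{\pi}{2}$) come out as very small non-zero numbers.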

Answer to Exercise 81. (a) The set $\{(x, y) : x = \tan t, y = \sec t, t \in \mathbb{R}\}$ is equal to the set $\{(x, y) : y^2 - x^2 = 1\}$.

(b) $\frac{dx}{dt} = \sec^2 t = y^2$ is always positive and so the particle is always moving rightwards.

(c) We know that at $t = 0$, we have $x = \tan t = 0$ and $y = \sec t = 1$, and so the particle must be at position $B$.

At $t = 1$, $t \in \left[0, \frac{\pi}{2}\right)$, we have $x = \tan t \ge 0$ and $y = \sec t > 0$, and so the particle is in the top-right quadrant. The particle must thus be at position $C$.

At both $t = 2$ and $3$, $t \in \left(\frac{\pi}{2}, \pi\right]$, we have $x = \tan t \le 0$ and $y = \sec t < 0$, and so the particle is in the bottom-left quadrant. We know moreover that $\tan 3 > \tan 2$, so that the particle is further to the right at time $t = 3$ than at time $t = 2$. So at time $t = 2$, the particle is at position $D$; and at time $t = 3$, the particle is at position $E$.

At $t = 4$, $t \in \left(\pi, \frac{3\pi}{2}\right)$, we have $x = \tan t \ge 0$ and $y = \sec t < 0$, and so the particle is in the bottom-right quadrant. The particle must thus be at position $F$.

Finally, at $t = 5$, the particle must be at position $A$.

Answer to Exercise 82 (a). Write $\frac{x + 1}{2} = \sin t$ and $\frac{y}{3} = \cos^2 t$. Since $\cos^2 t + \sin^2 t = 1$ for all $t$, the set can be rewritten as $\left\{(x, y) : \frac{y}{3} + \left(\frac{x + 1}{2}\right)^2 = 1\right\}$ or even more simply as $\left\{(x, y) : y = 3 - 3\left(\frac{x + 1}{2}\right)^2 = 3 - 0.75(x + 1)^2\right\}$.

When $t = 0$, $x = -1$ and $y = 3$.

Moreover, we can compute $v_x = \frac{dx}{dt} = 2\cos t$ and $v_y = \frac{dy}{dt} = 6\cos t(-\sin t)$. And so when $t = 0$, $v_x = 2$ and $v_y = 0$, meaning the particle is moving to the right.

[Figure: graph of $y = 3 - 0.75(x + 1)^2$, showing the instantaneous direction of travel at $t = 0$, where $x = -1$, $y = 3$, $v_x = 2\cos t = 2$ m s$^{-1}$, and $v_y = -6\sin t\cos t = 0$ m s$^{-1}$.]

Answer to Exercise 82 (b). Write $t = 1 + \frac{1}{x}$ and $y = \left(1 + \frac{1}{x}\right)^2 + 1$. The set can be rewritten as $\left\{(x, y) : y = \left(1 + \frac{1}{x}\right)^2 + 1\right\}$.

When $t = 0$, $x = -1$ and $y = 1$.

Moreover, we can compute $v_x = \frac{dx}{dt} = \frac{-1}{(t - 1)^2}$ and $v_y = \frac{dy}{dt} = 2t$. And so when $t = 0$, $v_x = -1$ and $v_y = 0$, meaning the particle is moving to the left.

This particle actually travels in rather strange ways. As t goes from 0 to 1, the particle hurtles leftwards towards x = −∞. Then as t goes from just under 1 second to just over 1 second, it magically reappears on the extreme right, at x = ∞. Thereafter it continues to travel leftwards again, towards the vertical axis.

[Figure: graph of $y = \left(1 + \frac{1}{x}\right)^2 + 1$.]

Answer to Exercise 82 (c). Write $t = x + 1$. Then the set can be rewritten as $\{(x, y) : y = \ln(2x + 3)\}$. When $t = 0$, $x = -1$ and $y = \ln 1 = 0$.

Moreover, we can compute $v_x = \frac{dx}{dt} = 1$ and $v_y = \frac{dy}{dt} = \frac{2}{2t + 1}$. And so when $t = 0$, $v_x = 1$ and $v_y = 2$, meaning the particle is moving to the northeast.

[Figure: graph of $y = \ln(2x + 3)$, showing the instantaneous direction of travel at $t = 0$, where $x = -1$, $y = 0$, $v_x = 1$ m s$^{-1}$, and $v_y = 2$ m s$^{-1}$.]

86.15

Answers to Exercises in Ch. 18: Equations and Inequalities

Answer to Exercise 83(a) $\frac{2x + 1}{3x + 2} > 0 \iff$ one of the following is true:

1. “$2x + 1 > 0$ AND $3x + 2 > 0$” $\iff$ “$x > -1/2$ AND $x > -2/3$” $\iff$ “$x > -1/2$”; OR
2. “$2x + 1 < 0$ AND $3x + 2 < 0$” $\iff$ “$x < -1/2$ AND $x < -2/3$” $\iff$ “$x < -2/3$”.

Altogether then, $\frac{2x + 1}{3x + 2} > 0 \iff$ “$x < -2/3$ OR $x > -1/2$”.

Or in set notation, $\frac{2x + 1}{3x + 2} > 0 \iff$ “$x \in (-\infty, -2/3) \cup (-1/2, \infty)$”.

(b) $\frac{x - 1}{-4} > 0 \iff$ one of the following is true:

1. “$x - 1 > 0$ AND $-4 > 0$”, but the latter inequality is impossible; OR
2. “$x - 1 < 0$ AND $-4 < 0$” $\iff x < 1$.

Altogether then, $\frac{x - 1}{-4} > 0 \iff x < 1$.

(c) $\frac{-1}{-4} > 0 \iff$ one of the following is true:

1. “$-1 > 0$ AND $-4 > 0$”, both of which are impossible; OR
2. “$-1 < 0$ AND $-4 < 0$”, both of which are always true.

Altogether then, $\frac{-1}{-4} > 0$ is always true and, in particular, true for any value of $x$.

(d) $\frac{1}{-4} > 0 \iff$ one of the following is true:

1. “$1 > 0$ AND $-4 > 0$”, the latter of which is impossible; OR
2. “$1 < 0$ AND $-4 < 0$”, the former of which is impossible.

Altogether then, $\frac{1}{-4} > 0$ is always false and, in particular, false for every value of $x$.

(e) $\frac{-3x - 18}{9x - 14} > 0 \iff$ one of the following is true:

1. “$-3x - 18 > 0$ AND $9x - 14 > 0$” $\iff$ “$x < -6$ AND $x > 14/9$”, but these two inequalities are mutually contradictory, so together they are impossible; OR
2. “$-3x - 18 < 0$ AND $9x - 14 < 0$” $\iff$ “$x > -6$ AND $x < 14/9$” $\iff$ “$x \in (-6, 14/9)$”.

Altogether then, $\frac{-3x - 18}{9x - 14} > 0 \iff$ “$x \in (-6, 14/9)$”.

(f) $$\frac{2x + 3}{-x + 7} < 9 \iff 9 - \frac{2x + 3}{-x + 7} > 0 \iff 9 + \frac{2x + 3}{x - 7} > 0 \iff \frac{9x - 63 + 2x + 3}{x - 7} > 0 \iff \frac{11x - 60}{x - 7} > 0.$$

This last inequality $\iff$ one of the following is true:

1. “$11x - 60 > 0$ AND $x - 7 > 0$” $\iff$ “$x > \frac{60}{11}$ AND $x > 7$” $\iff$ “$x > 7$”; OR
2. “$11x - 60 < 0$ AND $x - 7 < 0$” $\iff$ “$x < \frac{60}{11}$ AND $x < 7$” $\iff$ “$x < \frac{60}{11}$”.

Altogether then, $\frac{2x + 3}{-x + 7} < 9 \iff$ “$x > 7$ OR $x < \frac{60}{11}$”.

Or in set notation, $\frac{2x + 3}{-x + 7} < 9 \iff$ “$x \in \left(-\infty, \frac{60}{11}\right) \cup (7, \infty)$”.

Answer to Exercise 84 (a). When is $\frac{x^2 + 2x + 1}{x^2 - 3x + 2} > 0$?

The numerator $x^2 + 2x + 1$ is a ∪-shaped quadratic with one real zero given by $-\frac{b}{2a} = -1$ (i.e. it just touches the horizontal axis at $x = -1$). Hence, $x^2 + 2x + 1 > 0 \iff x \ne -1$. Also, it is never the case that $x^2 + 2x + 1 < 0$.

The denominator $x^2 - 3x + 2$ is a ∪-shaped quadratic with two real zeros given by $1, 2$. Hence, $x^2 - 3x + 2 > 0 \iff$ “$x < 1$ OR $x > 2$”. Since it is never the case that the numerator is negative, we don’t need to bother checking when the denominator is negative.

Altogether then, $\frac{x^2 + 2x + 1}{x^2 - 3x + 2} > 0 \iff$ “$x \ne -1$ AND ($x < 1$ OR $x > 2$)” $\iff$ “$x \in (-\infty, -1) \cup (-1, 1) \cup (2, \infty)$”.

Answer to Exercise 84 (b). When is $\frac{x^2 - 1}{x^2 - 4} > 0$?

The numerator $x^2 - 1$ is a ∪-shaped quadratic with two real zeros given by $-1, 1$. Hence, $x^2 - 1 > 0 \iff$ “$x < -1$ OR $x > 1$”. Also, $x^2 - 1 < 0 \iff$ “$x \in (-1, 1)$”.

The denominator $x^2 - 4$ is a ∪-shaped quadratic with two real zeros given by $-2, 2$. Hence, $x^2 - 4 > 0 \iff$ “$x < -2$ OR $x > 2$”. Also, $x^2 - 4 < 0 \iff$ “$x \in (-2, 2)$”.

So $\frac{x^2 - 1}{x^2 - 4} > 0 \iff$

1. “$x < -1$ OR $x > 1$” AND “$x < -2$ OR $x > 2$” $\iff$ “$x < -2$ OR $x > 2$”; OR
2. “$x \in (-1, 1)$ AND $x \in (-2, 2)$” $\iff$ “$x \in (-1, 1)$”.

Altogether then, $\frac{x^2 - 1}{x^2 - 4} > 0 \iff$ “$x < -2$ OR $x > 2$” OR “$x \in (-1, 1)$” $\iff x \in (-\infty, -2) \cup (-1, 1) \cup (2, \infty)$.
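A quick way to sanity-check a rational inequality such as the one in Exercise 84 (b) is to test one sample point in each interval cut out by the zeros of the numerator and denominator. The helper below is a hypothetical sketch (the name `sign_chart` is not from the textbook) and assumes the critical points are supplied by hand.

```python
def sign_chart(numerator, denominator, critical_points):
    """Test one sample point in each interval cut out by the critical points.

    Returns a list of (sample_x, is_positive) pairs, ordered left to right.
    """
    points = sorted(critical_points)
    samples = [points[0] - 1]                                  # left of all critical points
    samples += [(lo + hi) / 2 for lo, hi in zip(points, points[1:])]
    samples.append(points[-1] + 1)                             # right of all critical points
    return [(x, numerator(x) / denominator(x) > 0) for x in samples]

# (x^2 - 1)/(x^2 - 4) with critical points -2, -1, 1, 2
chart = sign_chart(lambda x: x**2 - 1, lambda x: x**2 - 4, [-2, -1, 1, 2])
for x, positive in chart:
    print(x, positive)
```

The samples at $-3$, $0$, and $3$ give a positive quotient and those at $-1.5$ and $1.5$ give a negative one, which agrees with the solution set $(-\infty, -2) \cup (-1, 1) \cup (2, \infty)$.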

Answer to Exercise 84 (c). When is $\frac{x^2 - 3x - 18}{-x^2 + 9x - 14} > 0$?

The numerator $x^2 - 3x - 18$ has a ∪-shaped graph with two real zeros given by $6, -3$. Hence, $x^2 - 3x - 18 > 0 \iff$ “$x < -3$ OR $x > 6$”. Also, $x^2 - 3x - 18 < 0 \iff$ “$x \in (-3, 6)$”.

The denominator $-x^2 + 9x - 14$ has a ∩-shaped graph and has two real zeros given by $2, 7$. Hence, $-x^2 + 9x - 14 > 0 \iff$ “$x \in (2, 7)$”. Also, $-x^2 + 9x - 14 < 0 \iff$ “$x < 2$ OR $x > 7$”.

So $\frac{x^2 - 3x - 18}{-x^2 + 9x - 14} > 0 \iff$

1. “$x < -3$ OR $x > 6$” AND “$x \in (2, 7)$” $\iff$ “$x \in (6, 7)$”; OR
2. “$x \in (-3, 6)$” AND “$x < 2$ OR $x > 7$” $\iff$ “$x \in (-3, 2)$”.

Altogether then, $\frac{x^2 - 3x - 18}{-x^2 + 9x - 14} > 0 \iff x \in (-3, 2) \cup (6, 7)$. (As a check, at $x = 0$ the quotient is $\frac{-18}{-14} > 0$, while at $x = 4$ it is $\frac{-14}{6} < 0$.)

Answer to Exercise 84 (d). Let’s rearrange the inequality:

$$\frac{x + 1}{-x + 4} > \frac{-x + 2}{2x - 1} \iff \frac{x + 1}{-x + 4} - \frac{-x + 2}{2x - 1} > 0 \iff \frac{(x + 1)(2x - 1) + (x - 2)(-x + 4)}{(-x + 4)(2x - 1)} > 0.$$

But $\frac{(x + 1)(2x - 1) + (x - 2)(-x + 4)}{(-x + 4)(2x - 1)} = \frac{2x^2 + x - 1 - x^2 + 6x - 8}{(-x + 4)(2x - 1)} = \frac{x^2 + 7x - 9}{(-x + 4)(2x - 1)}$.

Hence, $\frac{x + 1}{-x + 4} > \frac{-x + 2}{2x - 1} \iff \frac{x^2 + 7x - 9}{(-x + 4)(2x - 1)} > 0$. When is this latter inequality true?

The numerator $x^2 + 7x - 9$ is a ∪-shaped quadratic with zeros $a = \frac{-7 - \sqrt{85}}{2}$ and $b = \frac{-7 + \sqrt{85}}{2}$.

Hence, the numerator is negative for $x \in (a, b)$ and positive for $x \in (-\infty, a) \cup (b, \infty)$.

The denominator $(-x + 4)(2x - 1)$ is a ∩-shaped quadratic with zeros $0.5$ and $4$. Hence, the denominator is positive for $x \in (0.5, 4)$ and negative for $x \in (-\infty, 0.5) \cup (4, \infty)$. So,

$\frac{x^2 + 7x - 9}{(-x + 4)(2x - 1)} > 0 \iff$

1. “$x \in (a, b)$” AND “$x \in (-\infty, 0.5) \cup (4, \infty)$” $\iff x \in \left(0.5\left(-7 - \sqrt{85}\right), 0.5\right)$; OR
2. “$x \in (-\infty, a) \cup (b, \infty)$” AND “$x \in (0.5, 4)$” $\iff x \in \left(0.5\left(-7 + \sqrt{85}\right), 4\right)$.

Altogether then,

$$\frac{x^2 + 7x - 9}{(-x + 4)(2x - 1)} > 0 \iff x \in \left(0.5\left(-7 - \sqrt{85}\right), 0.5\right) \cup \left(0.5\left(-7 + \sqrt{85}\right), 4\right).$$

Answer to Exercise 85(a) Rewrite the inequality as $x^3 - x^2 + x - 1 - e^x > 0$. Graph $y = x^3 - x^2 + x - 1 - e^x$ on your TI84.

[TI84 screenshots: after graphing; left x-intercept; right x-intercept.]

So $x^3 - x^2 + x - 1 - e^x = 0 \iff x \approx 3.0, 3.5$. Thus, $x^3 - x^2 + x - 1 > e^x \iff x \in (3.0, 3.5)$.

(b) Rewrite the inequality as $\sqrt{x} - \cos x > 0$. Graph $y = \sqrt{x} - \cos x$ on your TI84.

[TI84 screenshots: after graphing; zoom in; the only x-intercept.]

So $\sqrt{x} - \cos x = 0 \iff x \approx 0.6$. Thus, $\sqrt{x} > \cos x \iff x > 0.6$.

(c) Rewrite the inequality as $\frac{1}{1 - x^2} - x^3 - \sin x > 0$. Graph $y = \frac{1}{1 - x^2} - x^3 - \sin x$ on your TI84.

[TI84 screenshots: after graphing; zoom in; the only x-intercept.]

Examining the graph from right to left, we observe that $\frac{1}{1 - x^2} - x^3 - \sin x$ is negative for $x > 1$ and positive for $x \in (-1, 1)$. For $x < -1$, the expression is positive to the left of what appears to be the only horizontal intercept. We find $\frac{1}{1 - x^2} - x^3 - \sin x = 0 \iff x \approx -1.2$. Thus,

$$\frac{1}{1 - x^2} - x^3 - \sin x > 0 \iff x \in (-\infty, -1.2) \cup (-1, 1).$$

Answer to Exercise 86. Let $A$, $B$, and $C$ be the present-day age of Apu, Beng, and Caleb. Let $k$ be the number of years ago when Apu was 40 years old. From the first sentence, we know that $A - k \overset{1}{=} 40$ and $B - k \overset{2}{=} 2(C - k)$. From the second sentence: $A \overset{3}{=} 2B$ and $C \overset{4}{=} 28$.

Sub $\overset{3}{=}$ into $\overset{1}{=}$ and $\overset{4}{=}$ into $\overset{2}{=}$ to get $2B - k \overset{5}{=} 40$ and $B - k \overset{6}{=} 2(28 - k)$.

From $\overset{5}{=}$, $k \overset{7}{=} 2B - 40$. Sub $\overset{7}{=}$ into $\overset{6}{=}$ to get:

$$B - (2B - 40) = 2[28 - (2B - 40)]$$
$$40 - B = 2[68 - 2B] = 136 - 4B$$
$$3B = 96 \implies B = 32.$$

Beng is 32 years old today. And from $\overset{3}{=}$, Apu is 64 years old today.
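The substitution above can be double-checked by brute force over plausible ages. This is only an illustrative sketch: the function name `solve_ages` and the search bound of 150 years are arbitrary choices.

```python
def solve_ages(caleb_age=28, apu_age_then=40, max_age=150):
    """Search for (Apu, Beng, k) satisfying the conditions of Exercise 86."""
    solutions = []
    for beng in range(1, max_age):
        apu = 2 * beng              # A = 2B: Apu is twice as old as Beng today
        k = apu - apu_age_then      # A - k = 40: years since Apu was 40
        if beng - k == 2 * (caleb_age - k):   # B - k = 2(C - k)
            solutions.append((apu, beng, k))
    return solutions

print(solve_ages())  # [(64, 32, 24)]
```

The search finds a single solution, with Apu aged 64, Beng aged 32, and $k = 24$.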

Answer to Exercise 87. At 3pm, Plane A is 300 km northeast of the starting point and Plane B is 600 km south of it. The angle formed by their flight paths is 3π/4. The distance between the two planes is the third side of the triangle whose other two sides are 300 km and 600 km and whose angle between those two sides is 3π/4. By the Law of Cosines (Proposition 260) from O-Level, the third side c of the triangle is given by:

$$c^2 = a^2 + b^2 - 2ab\cos C = 90000 + 360000 - 2(300)(600)\left(-\frac{\sqrt{2}}{2}\right) \approx 704558.$$

Hence, c ≈ 839.

Thus, at 3pm, the two planes are about 839 km apart. From 3pm, the distance between the two planes is shrinking by 300 km/h. Hence, it will be another 839/300 hours, or about 2h 48m, before they collide. Hence, they will collide at around 5:48pm.


Answer to Exercise 88. The given information provides this system of equations:

$a(1)^2 + b(1) + c = 2$, (1)
$a(3)^2 + b(3) + c = 5$, (2)
$a(6)^2 + b(6) + c = 9$. (3)

You can solve this system of equations either by calculator or by hand, as I do now: Take (2) minus (1) to get 8a + 2b = 3 or b = 0.5(3 − 8a) = 1.5 − 4a. (4)

Plug (4) into (1) to get a + 1.5 − 4a + c = 2 or c = 0.5 + 3a. (5)

Plug (4) and (5) into (3) to get

36a + 6(1.5 − 4a) + 0.5 + 3a = 9 ⟺ 15a + 9.5 = 9 ⟺ 15a = −0.5 ⟺ a = −1/30.

Now from (4), b = 49/30 and from (5), c = 0.4.
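The hand calculation in the answer to Exercise 88 can be cross-checked with exact rational arithmetic. The sketch below is only an illustration: the helper name `solve_3x3` is hypothetical, and the three points (1, 2), (3, 5), (6, 9) are read off the three equations of the system.

```python
from fractions import Fraction

def solve_3x3(rows):
    """Solve a 3x3 linear system by Gauss-Jordan elimination.

    Each row is [coef_a, coef_b, coef_c, right_hand_side].
    """
    m = [[Fraction(x) for x in row] for row in rows]
    n = 3
    for col in range(n):
        # Pick a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so that the pivot entry becomes 1.
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]

# Each point (x, y) on y = a*x^2 + b*x + c gives the row [x^2, x, 1, y].
points = [(1, 2), (3, 5), (6, 9)]
a, b, c = solve_3x3([[x * x, x, 1, y] for x, y in points])
print(a, b, c)  # -1/30 49/30 2/5
```

Using `Fraction` rather than floats keeps the answers in the same exact form as the hand solution.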

Answer to Exercise 89. The turning point (which is a minimum turning point if a is positive) of the equation is at $x = -\frac{b}{2a}$ and

$$y = a\left(-\frac{b}{2a}\right)^2 + b\left(-\frac{b}{2a}\right) + c = \frac{b^2}{4a} - \frac{b^2}{2a} + c = c - \frac{b^2}{4a}.$$

We know that at the minimum point, x = 0 and y = 0. So b = 0 and c = 0. Since (−1, 2) satisfies the equation $y = ax^2 + bx + c$, we also have $a(-1)^2 + b(-1) + c = 2$, so that a = 2.

Altogether then, a = 2, b = 0, and c = 0.


Answer to Exercise 90 (a). The system of equations is $x^2 + y^2 = 1$, $y = \sin x$.

Rewrite the first equation into two equations $y = \sqrt{1 - x^2}$ and $y = -\sqrt{1 - x^2}$.

Now combine these with the second of the two given equations to form two new equations: $y = \sin x - \sqrt{1 - x^2}$ and $y = \sin x + \sqrt{1 - x^2}$.

Our goal is to find the horizontal intercepts of each of these equations. These horizontal intercepts will give us the solutions to the above system of equations.

1. Graph the equation $y = \sin x - \sqrt{1 - x^2}$. It looks like there is only one horizontal intercept.
2. Find the horizontal intercept using the “zero” option. The horizontal intercept is 0.7391.

Now repeat the above, but for the second equation:

3. Graph the equation $y = \sin x + \sqrt{1 - x^2}$. It looks like there is only one horizontal intercept.
4. Find the horizontal intercept using the “zero” option. The horizontal intercept is −0.7391.

[Figures: TI84 screens after Steps 1, 2, 3, and 4.]

Conclusion: This system of equations has two solutions and their x-coordinates are −0.7391 and 0.7391. To find the corresponding y-coordinates, we need merely plug these values of x into either of the equations in the original system of equations: y = sin x = sin(−0.7391) ≈ −0.6736 and y = sin x = sin(0.7391) ≈ 0.6736. Altogether, this system of equations has two solutions: (−0.7391, −0.6736) and (0.7391, 0.6736).
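For readers without a TI84 to hand, the two horizontal intercepts in the answer to Exercise 90 (a) can be approximated by bisection. This is a minimal sketch of a different technique from the calculator's “zero” option; the function name `bisect` and the bracketing intervals [0, 1] and [−1, 0] are assumptions chosen for illustration.

```python
import math

def bisect(f, lo, hi, tol=1e-10):
    """Find a root of f in [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid              # the sign change is in the left half
        else:
            lo, f_lo = mid, f_mid  # the sign change is in the right half
    return (lo + hi) / 2

def upper(x):
    return math.sin(x) - math.sqrt(1 - x * x)

def lower(x):
    return math.sin(x) + math.sqrt(1 - x * x)

x1 = bisect(upper, 0.0, 1.0)
x2 = bisect(lower, -1.0, 0.0)
print(round(x1, 4), round(math.sin(x1), 4))  # 0.7391 0.6736
print(round(x2, 4), round(math.sin(x2), 4))  # -0.7391 -0.6736
```

Bisection only finds one root per bracketing interval, so the graph is still needed to see how many intercepts there are.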


Answer to Exercise 90 (b). The system of equations is $y = \frac{1}{1 + \sqrt{x}}$, $y = x^5 - x^3 + 2$.

Rewrite the two equations into a new equation $y = x^5 - x^3 + 2 - \frac{1}{1 + \sqrt{x}}$.

Our goal is to find the horizontal intercepts of this equation. These horizontal intercepts will give us the solutions to the above system of equations.

1. Graph the equation $y = x^5 - x^3 + 2 - \frac{1}{1 + \sqrt{x}}$. It looks like there are no horizontal intercepts.

[Figure: TI84 screen after Step 1.]

Conclusion: This system of equations has no solutions.


Answer to Exercise 90 (c). The system of equations is $y = \frac{1}{1 - x^2}$, $y = x^3 + \sin x$.

Rewrite the two equations into a new equation $y = \frac{1}{1 - x^2} - x^3 - \sin x$.

Our goal is to find the horizontal intercepts of this equation. These horizontal intercepts will give us the solutions to the above system of equations.

1. Graph the equation $y = \frac{1}{1 - x^2} - x^3 - \sin x$. It looks like there is only one horizontal intercept.
2. Find the horizontal intercept. It is −1.1790.

[Figures: TI84 screens after Steps 1 and 2.]

Conclusion: This system of equations has one solution and its x-coordinate is −1.1790. To find the corresponding y-coordinate, we need merely plug this value of x into either of the equations in the original system of equations: $y = \frac{1}{1 - x^2} = \frac{1}{1 - (-1.1790)^2} \approx -2.5633$. Altogether, this system of equations has one solution: (−1.1790, −2.5633).


87 Answers to Exercises in Part II: Sequences and Series

87.1 Answers for Ch. 19: Finite Sequences

Answer to Exercise 91. (a) A corresponding function for the finite sequence (1, 4, 9, 16, 25, 36, 49, 64, 81, 100) is a function f with
• Domain {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
• Codomain R; and
• Mapping rule $f(n) = n^2$ for all n.

(b) A corresponding function for the finite sequence (2, 5, 8, 11, 14, 17, 20) is a function f with
• Domain {1, 2, 3, 4, 5, 6, 7};
• Codomain R; and
• Mapping rule f(n) = 3n − 1 for all n.

(c) A corresponding function for the finite sequence (0.5, 4, 13.5, 32, 62.5, 108, 171.5) is a function f with
• Domain {1, 2, 3, 4, 5, 6, 7};
• Codomain R; and
• Mapping rule $f(n) = n^3/2$ for all n.

(d) A corresponding function for the finite sequence (2, 6, 6, 12, 10, 18, 14, 24, 18, 30, 22, 36, 26, 42) is a function f with
• Domain {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14};
• Codomain R; and
• Mapping rule f(n) = 2n for all odd n and f(n) = 3n for all even n.

(e) There is no obvious pattern here. So a corresponding function for the finite sequence (18, 14.5) is a (trivial) function f with
• Domain {1, 2};
• Codomain R; and
• Mapping rule f(1) = 18 and f(2) = 14.5.


Answer to Exercise 92. (a) A corresponding function for the finite sequence (3, 4, 9, 64, 3969) is the function f with
• Domain {1, 2, 3, 4, 5};
• Codomain R; and
• Mapping rule f(1) = 3 and $f(n) = [f(n-1) - 1]^2$ (the recurrence relation) for all n ≥ 2.

(b) A corresponding function for the finite sequence (1, 2, 10, 290, 252010) is the function f with
• Domain {1, 2, 3, 4, 5};
• Codomain R; and
• Mapping rule f(1) = 1 and $f(n) = 3[f(n-1)]^2 - f(n-1)$ (the recurrence relation) for all n ≥ 2.
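The two recurrence relations in the answer to Exercise 92 can be checked by generating the terms directly. This is a small sketch; the helper name `generate` is hypothetical.

```python
def generate(first, step, length):
    """Return the first `length` terms of f(1) = first, f(n) = step(f(n - 1))."""
    terms = [first]
    while len(terms) < length:
        terms.append(step(terms[-1]))
    return terms

seq_a = generate(3, lambda t: (t - 1) ** 2, 5)    # f(n) = [f(n-1) - 1]^2
seq_b = generate(1, lambda t: 3 * t ** 2 - t, 5)  # f(n) = 3[f(n-1)]^2 - f(n-1)
print(seq_a)  # [3, 4, 9, 64, 3969]
print(seq_b)  # [1, 2, 10, 290, 252010]
```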

87.2 Answers for Ch. 20: Infinite Sequences

Answer to Exercise 93. (a) A corresponding function for the infinite sequence (1, 4, 9, 16, 25, 36, 49, 64, 81, 100, . . . ) is a function f with
• Domain $\mathbb{Z}^+$;
• Codomain R; and
• Mapping rule $f(n) = n^2$ for all n.

(b) A corresponding function for the infinite sequence (2, 5, 8, 11, 14, 17, 20, . . . ) is a function f with
• Domain $\mathbb{Z}^+$;
• Codomain R; and
• Mapping rule f(n) = 3n − 1 for all n.

(c) A corresponding function for the infinite sequence (0.5, 4, 13.5, 32, 62.5, 108, 171.5, . . . ) is a function f with
• Domain $\mathbb{Z}^+$;
• Codomain R; and
• Mapping rule $f(n) = n^3/2$ for all n.

(d) A corresponding function for the infinite sequence (2, 6, 6, 12, 10, 18, 14, 24, 18, 30, 22, 36, 26, 42, . . . ) is a function f with
• Domain $\mathbb{Z}^+$;
• Codomain R; and
• Mapping rule f(n) = 2n for all odd n and f(n) = 3n for all even n.


87.3 Answers for Ch. 22: Summation

Answer to Exercise 94. (a) $1 + 4 + 9 + 16 + 25 + 36 + 49 + 64 + 81 + 100 = \sum_{n=1}^{10} n^2$.

(b) $2 + 5 + 8 + 11 + 14 + 17 + 20 + 23 = \sum_{n=1}^{8} (3n - 1)$.

(c) $0.5 + 4 + 13.5 + 32 + 62.5 + 108 + 171.5 = \sum_{n=1}^{7} \frac{n^3}{2}$.

Answer to Exercise 95. (a)

$$\sum_{n=-2}^{5} (2 - n)^n = [2 - (-2)]^{-2} + [2 - (-1)]^{-1} + [2 - 0]^0 + [2 - 1]^1 + [2 - 2]^2 + [2 - 3]^3 + [2 - 4]^4 + [2 - 5]^5$$
$$= 4^{-2} + 3^{-1} + 2^0 + 1^1 + 0^2 + (-1)^3 + (-2)^4 + (-3)^5$$
$$= \frac{1}{16} + \frac{1}{3} + 1 + 1 + 0 + (-1) + 16 + (-243) = -\frac{10829}{48}.$$

(b) $\sum_{n=16}^{17} (4n + 5) = (4 \times 16 + 5) + (4 \times 17 + 5) = 142$.

(c) Remember: we can choose any name (letter) we like for the index or dummy variable. We’ve usually been using n. But here I chose to use the letter x instead. This makes no difference.

$$\sum_{x=31}^{33} (x - 3) = (31 - 3) + (32 - 3) + (33 - 3) = 28 + 29 + 30 = 87.$$
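Sigma notation maps directly onto a loop over the index, which gives a quick way to check sums such as those in Exercises 94 and 95. The sketch below assumes a hypothetical helper `summation` and uses `Fraction` so that negative powers stay exact.

```python
from fractions import Fraction

def summation(term, start, stop):
    """Sum term(n) for n = start, start + 1, ..., stop (both ends included)."""
    return sum(term(n) for n in range(start, stop + 1))

squares = summation(lambda n: n ** 2, 1, 10)               # Exercise 94(a)
linear = summation(lambda n: 3 * n - 1, 1, 8)              # Exercise 94(b)
half_cubes = summation(lambda n: Fraction(n ** 3, 2), 1, 7)  # Exercise 94(c)
print(squares, linear, half_cubes)  # 385 100 392

power_sum = summation(lambda n: Fraction(2 - n) ** n, -2, 5)  # Exercise 95(a)
two_terms = summation(lambda n: 4 * n + 5, 16, 17)            # Exercise 95(b)
renamed = summation(lambda x: x - 3, 31, 33)                  # Exercise 95(c)
print(two_terms, renamed)  # 142 87
```

The last call uses `x` as the lambda's parameter, mirroring the point in (c) that the name of the dummy variable makes no difference.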


87.4 Answers for Ch. 23: Arithmetic Sequences and Series

Answer to Exercise 96. (a) The common difference in the arithmetic series $2 + 7 + 12 + 17 + 22 + 27 + 32 + \dots + 997 = \sum_{n=0}^{199} (2 + 5n)$ is 5. There are in total 200 terms. By Fact 13, its sum of series is $(2 + 997) \times \frac{200}{2} = 99900$.

(b) The common difference in the arithmetic series $3 + 20 + 37 + 54 + 71 + \dots + 1703 = \sum_{n=0}^{100} (3 + 17n)$ is 17. There are in total 101 terms. By Fact 13, its sum of series is $(3 + 1703) \times \frac{101}{2} = 86153$.

(c) The common difference in the arithmetic series $81 + 89 + 97 + 105 + 113 + \dots + 8081 = \sum_{n=0}^{1000} (81 + 8n)$ is 8. There are in total 1001 terms. By Fact 13, its sum of series is $(81 + 8081) \times \frac{1001}{2} = 4085081$.
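The term counts and sums in the answer to Exercise 96 follow one recipe: count the terms, then multiply the sum of the first and last terms by half the count. A minimal sketch, with the hypothetical helper name `arithmetic_sum`:

```python
def arithmetic_sum(first, last, difference):
    """Return (number of terms, sum) of the arithmetic series first + ... + last."""
    count = (last - first) // difference + 1  # number of terms
    total = (first + last) * count // 2       # (first + last) x count / 2
    return count, total

print(arithmetic_sum(2, 997, 5))    # (200, 99900)
print(arithmetic_sum(3, 1703, 17))  # (101, 86153)
print(arithmetic_sum(81, 8081, 8))  # (1001, 4085081)
```

The helper assumes that `last` really is a term of the series, that is, `last - first` is a multiple of `difference`.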


87.5 Answers for Ch. 24: Geometric Sequences and Series

Answer to Exercise 97. (a) The geometric series $7 + 14 + 28 + 56 + \dots + 448 = \sum_{n=0}^{6} (7 \times 2^n)$ has common ratio 2. There are in total 7 terms. Thus, the geometric sum of series is $7 \times \frac{1 - 2^7}{1 - 2} = 7 \times \frac{-127}{-1} = 889$.

(b) The geometric series $20 + 10 + 5 + \dots + 5/8 = \sum_{n=0}^{5} [20 \times (1/2)^n]$ has common ratio 1/2. There are in total 6 terms. Thus, the geometric sum of series is $20 \times \left[1 - \left(\frac{1}{2}\right)^6\right] / \left(1 - \frac{1}{2}\right) = 20 \times \frac{63}{64} / \frac{1}{2} = 40 \times \frac{63}{64} = \frac{315}{8}$.

(c) The geometric series $1 + 1/3 + 1/9 + \dots + 1/243 = \sum_{n=0}^{5} \left(\frac{1}{3}\right)^n$ has common ratio 1/3. There are in total 6 terms. Thus, the geometric sum of series is $1 \times \left[1 - \left(\frac{1}{3}\right)^6\right] / \left(1 - \frac{1}{3}\right) = \frac{728}{729} / \frac{2}{3} = \frac{3}{2} \times \frac{728}{729} = \frac{364}{243}$.

Answer to Exercise 98. (a) The geometric series $6 + 9/2 + 27/8 + \dots = \sum_{n=0}^{\infty} \left[6 \times \left(\frac{3}{4}\right)^n\right]$ has common ratio 3/4. Thus, its sum is $\frac{6}{1 - 3/4} = 24$.

(b) The geometric series $20 + 10 + 5 + \dots = \sum_{n=0}^{\infty} \left[20 \times \left(\frac{1}{2}\right)^n\right]$ has common ratio 1/2. Thus, its sum is $\frac{20}{1 - 1/2} = 40$.

(c) The geometric series $1 + 1/3 + 1/9 + \dots = \sum_{n=0}^{\infty} \left(\frac{1}{3}\right)^n$ has common ratio 1/3. Thus, its sum is $1 / \left(1 - \frac{1}{3}\right) = 3/2$.
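The finite and infinite geometric sums used in Exercises 97 and 98 can be reproduced exactly with rational arithmetic. This is a sketch under the usual formulas; the function names are hypothetical.

```python
from fractions import Fraction

def geometric_sum(first, ratio, count):
    """Sum of the first `count` terms: first x (1 - ratio^count) / (1 - ratio)."""
    first, ratio = Fraction(first), Fraction(ratio)
    return first * (1 - ratio ** count) / (1 - ratio)

def geometric_sum_to_infinity(first, ratio):
    """Sum to infinity, which exists only when |ratio| < 1."""
    ratio = Fraction(ratio)
    if abs(ratio) >= 1:
        raise ValueError("the series diverges unless |ratio| < 1")
    return Fraction(first) / (1 - ratio)

print(geometric_sum(20, Fraction(1, 2), 6))             # 315/8
print(geometric_sum(1, Fraction(1, 3), 6))              # 364/243
print(geometric_sum_to_infinity(6, Fraction(3, 4)))     # 24
print(geometric_sum_to_infinity(20, Fraction(1, 2)))    # 40
print(geometric_sum_to_infinity(1, Fraction(1, 3)))     # 3/2
```

A finite sum can always be double-checked by adding the terms one by one, for example `sum(20 * Fraction(1, 2) ** n for n in range(6))`.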

87.6 Answers for Ch. 25: Proof by Induction

Answer to Exercise 99. Step #1. Let P(k) stand for the proposition that

$$\sum_{r=1}^{k} r^3 = \left[\frac{k(k+1)}{2}\right]^2.$$

Our goal is to show that P(k) is true for all k = 1, 2, 3, . . .

Step #2. Verify that P(1) is true: $\sum_{r=1}^{1} r^3 = 1^3 = 1 = \left[\frac{1(1+1)}{2}\right]^2$. ✓

Step #3. Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ).

Assume that P(j) is true. That is, $\sum_{r=1}^{j} r^3 = \left[\frac{j(j+1)}{2}\right]^2$.

Our goal is to show that P(j + 1) is true. That is, $\sum_{r=1}^{j+1} r^3 = \left[\frac{(j+1)[(j+1)+1]}{2}\right]^2$.

To this end, write

$$\sum_{r=1}^{j+1} r^3 = \sum_{r=1}^{j} r^3 + (j+1)^3 = \left[\frac{j(j+1)}{2}\right]^2 + (j+1)^3$$
$$= \frac{(j+1)^2}{4}\left[j^2 + 4(j+1)\right] = \frac{(j+1)^2}{4}\left(j^2 + 4j + 4\right) = \frac{(j+1)^2}{4}(j+2)^2$$
$$= \left[\frac{(j+1)(j+2)}{2}\right]^2 = \left[\frac{(j+1)[(j+1)+1]}{2}\right]^2,$$

as desired.

Answer to Exercise 100. Step #1. Let P(k) stand for the proposition that

$$\sum_{r=1}^{k} r a^r = a\,\frac{1 - (k+1)a^k + k a^{k+1}}{(1-a)^2}.$$

Our goal is to show that P(k) is true for all k = 1, 2, 3, . . .

Step #2. Verify that P(1) is true.

$$\sum_{r=1}^{1} r a^r = a = a\,\frac{(1-a)^2}{(1-a)^2} = a\,\frac{1 - 2a + a^2}{(1-a)^2} = a\,\frac{1 - (1+1)a^1 + 1 \times a^{1+1}}{(1-a)^2}. ✓$$

Step #3. Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ).

Assume that P(j) is true. That is,

$$\sum_{r=1}^{j} r a^r = a\,\frac{1 - (j+1)a^j + j a^{j+1}}{(1-a)^2}.$$

Our goal is to show that P(j + 1) is true. That is,

$$\sum_{r=1}^{j+1} r a^r = a\,\frac{1 - (j+2)a^{j+1} + (j+1)a^{j+2}}{(1-a)^2}.$$

To this end, write

$$\sum_{r=1}^{j+1} r a^r = \sum_{r=1}^{j} r a^r + (j+1)a^{j+1} = a\,\frac{1 - (j+1)a^j + j a^{j+1}}{(1-a)^2} + (j+1)a^{j+1}$$
$$= a\,\frac{1 - (j+1)a^j + j a^{j+1} + (j+1)a^j(1-a)^2}{(1-a)^2}$$
$$= a\,\frac{1 - (j+1)a^j + j a^{j+1} + (j+1)a^j\left(1 - 2a + a^2\right)}{(1-a)^2}$$
$$= a\,\frac{1 - (j+1)a^j + j a^{j+1} + j a^j - 2j a^{j+1} + j a^{j+2} + a^j - 2a^{j+1} + a^{j+2}}{(1-a)^2}$$
$$= a\,\frac{1 - (j+2)a^{j+1} + j a^{j+2} + a^{j+2}}{(1-a)^2} = a\,\frac{1 - (j+2)a^{j+1} + (j+1)a^{j+2}}{(1-a)^2},$$

as desired.

Answer to Exercise 101. Step #1. Let P(k) stand for the proposition that

$$\sum_{r=1}^{k} r^4 = \frac{k(k+1)(2k+1)(3k^2 + 3k - 1)}{30}.$$

Our goal is to show that P(k) is true for all k = 1, 2, 3, . . .

Step #2. Verify that P(1) is true.

$$\sum_{r=1}^{1} r^4 = 1^4 = \frac{1(1+1)(2 \times 1 + 1)(3 \times 1^2 + 3 \times 1 - 1)}{30}. ✓$$

Step #3. Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ).

Assume that P(j) is true. That is,

$$\sum_{r=1}^{j} r^4 = \frac{j(j+1)(2j+1)(3j^2 + 3j - 1)}{30}.$$

Our goal is to show that P(j + 1) is true. That is,

$$\sum_{r=1}^{j+1} r^4 = \frac{(j+1)[(j+1)+1][2(j+1)+1][3(j+1)^2 + 3(j+1) - 1]}{30}.$$

To this end, write

$$\sum_{r=1}^{j+1} r^4 = \sum_{r=1}^{j} r^4 + (j+1)^4 = \frac{j(j+1)(2j+1)(3j^2 + 3j - 1)}{30} + (j+1)^4$$
$$= \frac{j+1}{30}\left[j(2j+1)(3j^2 + 3j - 1) + 30(j+1)^3\right]$$
$$= \frac{j+1}{30}\left[(2j^2 + j)(3j^2 + 3j - 1) + 30j^3 + 90j^2 + 90j + 30\right]$$
$$= \frac{j+1}{30}\left(6j^4 + 6j^3 - 2j^2 + 3j^3 + 3j^2 - j + 30j^3 + 90j^2 + 90j + 30\right)$$
$$= \frac{j+1}{30}\left(6j^4 + 39j^3 + 91j^2 + 89j + 30\right)$$
$$= \frac{j+1}{30}\left(6j^4 + 18j^3 + 10j^2 + 21j^3 + 63j^2 + 35j + 18j^2 + 54j + 30\right)$$
$$= \frac{(j+1)(2j^2 + 7j + 6)(3j^2 + 9j + 5)}{30} = \frac{(j+1)(j+2)(2j+3)(3j^2 + 9j + 5)}{30}$$
$$= \frac{(j+1)(j+2)(2j+3)(3j^2 + 6j + 3 + 3j + 3 - 1)}{30}$$
$$= \frac{(j+1)[(j+1)+1][2(j+1)+1][3(j+1)^2 + 3(j+1) - 1]}{30},$$

as desired.
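A proof by induction settles the three identities in Exercises 99 to 101 for every k, but a numerical spot check is a cheap way to catch an algebra slip in the closed forms. The sketch below is only a sanity check, not a proof; the function names and the test value a = 2/3 are arbitrary assumptions.

```python
from fractions import Fraction

def cubes(k):
    """Closed form from Exercise 99."""
    return (k * (k + 1) // 2) ** 2

def weighted_powers(a, k):
    """Closed form from Exercise 100 (requires a != 1)."""
    return a * (1 - (k + 1) * a ** k + k * a ** (k + 1)) / (1 - a) ** 2

def fourth_powers(k):
    """Closed form from Exercise 101."""
    return k * (k + 1) * (2 * k + 1) * (3 * k * k + 3 * k - 1) // 30

a = Fraction(2, 3)  # any value other than 1 works; 2/3 is an arbitrary choice
for k in range(1, 21):
    assert cubes(k) == sum(r ** 3 for r in range(1, k + 1))
    assert weighted_powers(a, k) == sum(r * a ** r for r in range(1, k + 1))
    assert fourth_powers(k) == sum(r ** 4 for r in range(1, k + 1))
print("all three closed forms agree with direct summation for k = 1, ..., 20")
```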

88 Answers to Exercises in Part III: Vectors

88.1 Answers for Ch. 26: Quick Revision

Answer to Exercise 102.

88.2 Answers for Ch. 27: Vectors in 2D

Answer to Exercise 103. The other 8 vectors are $\overrightarrow{ad} = (4, -4)$, $\overrightarrow{ba} = (-4, 3)$, $\overrightarrow{bc} = (-4, 2)$, $\overrightarrow{ca} = (0, 1)$, $\overrightarrow{cb} = (4, -2)$, $\overrightarrow{da} = (-4, 4)$, $\overrightarrow{db} = (0, 1)$, and $\overrightarrow{dc} = (-4, 3)$.

Answer to Exercise 104. (a) If the vector (4, −3) has tail (0, 0), then its head is (0, 0) + (4, −3) = (4, −3). (b) If it has head (0, 0), then its tail is (0, 0) − (4, −3) = (−4, 3). (c) If it has tail (5, 2), then its head is (5, 2) + (4, −3) = (9, −1). (d) If it has head (5, 2), then its tail is (5, 2) − (4, −3) = (1, 5).

Answer to Exercise 105. $\overrightarrow{ac} + \overrightarrow{cb} = \overrightarrow{ab}$, $\overrightarrow{dc} + \overrightarrow{ca} = \overrightarrow{da}$, $\overrightarrow{bd} + \overrightarrow{da} = \overrightarrow{ba}$, $\overrightarrow{ad} - \overrightarrow{cd} = \overrightarrow{ad} + \overrightarrow{dc} = \overrightarrow{ac}$, $-\overrightarrow{dc} - \overrightarrow{bd} = \overrightarrow{cd} + \overrightarrow{db} = \overrightarrow{cb}$, and $\overrightarrow{bd} + \overrightarrow{db} = \overrightarrow{bb} = (0, 0)$. Note that this last vector $\overrightarrow{bb}$ carries us from the point b to the point b; in other words, it carries us nowhere. Hence, it is the zero vector, which can also be denoted by $\mathbf{0}$ or $\vec{0}$.

Answer to Exercise 106. These are all vectors. Specifically, $\overrightarrow{ac} - \overrightarrow{cb} = (0, -1) - (4, -2) = (-4, 1)$, $\overrightarrow{dc} - \overrightarrow{ca} = (-4, 3) - (0, 1) = (-4, 2)$, $\overrightarrow{bd} - \overrightarrow{da} = (0, -1) - (-4, 4) = (4, -5)$, $\overrightarrow{ad} + \overrightarrow{cd} = (4, -4) + (4, -3) = (8, -7)$, $\overrightarrow{dc} + \overrightarrow{bd} = (-4, 3) + (0, -1) = (-4, 2)$, and $\overrightarrow{bd} - \overrightarrow{db} = (0, -1) - (0, 1) = (0, -2)$.

Answer to Exercise 107. These are all lengths (or magnitudes). Specifically,
$|\overrightarrow{ac} - \overrightarrow{cb}| = |(-4, 1)| = \sqrt{(-4)^2 + 1^2} = \sqrt{17}$,
$|\overrightarrow{dc} - \overrightarrow{ca}| = |(-4, 2)| = \sqrt{(-4)^2 + 2^2} = \sqrt{20}$,
$|\overrightarrow{bd} - \overrightarrow{da}| = |(4, -5)| = \sqrt{4^2 + (-5)^2} = \sqrt{41}$,
$|\overrightarrow{ad} + \overrightarrow{cd}| = |(8, -7)| = \sqrt{8^2 + (-7)^2} = \sqrt{113}$,
$|\overrightarrow{dc} + \overrightarrow{bd}| = |(-4, 2)| = \sqrt{(-4)^2 + 2^2} = \sqrt{20}$, and
$|\overrightarrow{bd} - \overrightarrow{db}| = |(0, -2)| = \sqrt{0^2 + (-2)^2} = \sqrt{4} = 2$.

The distance between (18, 4) and (−1, −2) is $\sqrt{[18 - (-1)]^2 + [4 - (-2)]^2} = \sqrt{19^2 + 6^2} = \sqrt{397}$.

Answer to Exercise 108. No, in general |u + v| ≠ |u| + |v|. For example, |(1, 0)| = 1 and |(0, 1)| = 1, so that |(1, 0)| + |(0, 1)| = 2. But $|(1, 0) + (0, 1)| = |(1, 1)| = \sqrt{2}$. Hence, |(1, 0)| + |(0, 1)| ≠ |(1, 1)|.
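The vector sums, differences, and lengths in Exercises 105 to 107 are easy to recompute. In the sketch below the coordinates of the four points are an assumption: only the vectors between the points appear in these answers, so a is placed at the origin purely for illustration (the vectors themselves do not depend on that choice). The helper names are hypothetical.

```python
import math

# Assumed coordinates: a is put at the origin for illustration only.
a, b, c, d = (0, 0), (4, -3), (0, -1), (4, -4)

def vec(p, q):
    """Vector from point p to point q."""
    return (q[0] - p[0], q[1] - p[1])

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1])

def length(u):
    return math.hypot(u[0], u[1])

print(sub(vec(a, c), vec(c, b)))          # (-4, 1)
print(add(vec(a, d), vec(c, d)))          # (8, -7)
print(length(sub(vec(b, d), vec(d, b))))  # 2.0
print(length((1, 0)) + length((0, 1)), length(add((1, 0), (0, 1))))  # 2.0 versus sqrt(2)
```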

Answer to Exercise 109. $2\overrightarrow{ab} = 2(4, -3) = (8, -6)$ and indeed $|2\overrightarrow{ab}| = \sqrt{8^2 + (-6)^2} = \sqrt{100} = 10 = 2|\overrightarrow{ab}|$ ✓. $3\overrightarrow{ac} = 3(0, -1) = (0, -3)$ and indeed $|3\overrightarrow{ac}| = \sqrt{0^2 + (-3)^2} = \sqrt{9} = 3 = 3|\overrightarrow{ac}|$ ✓. $4\overrightarrow{ad} = 4(4, -4) = (16, -16)$ and indeed $|4\overrightarrow{ad}| = \sqrt{16^2 + (-16)^2} = \sqrt{512} = 4\sqrt{32} = 4|\overrightarrow{ad}|$ ✓.

Answer to Exercise 110. The unit vector in the direction $\overrightarrow{ab}$ is $\frac{1}{5}(4, -3)$. That in the direction $\overrightarrow{ac}$ is (0, −1). That in the direction $\overrightarrow{ad}$ is $\frac{1}{\sqrt{32}}(4, -4)$. The unit vectors in the directions $2\overrightarrow{ab}$, $3\overrightarrow{ac}$, and $4\overrightarrow{ad}$ are the same.

Answer to Exercise 111. (i) Write α + 7β = 0 and 3α + 5β = 1. Solving, we have α = 7/16 and β = −1/16. Thus, $(0, 1) = \frac{7}{16}(1, 3) - \frac{1}{16}(7, 5)$.

(ii) Write α + 7β = 1 and 3α + 5β = 0. Solving, we have α = −5/16 and β = 3/16. Thus, $(1, 0) = -\frac{5}{16}(1, 3) + \frac{3}{16}(7, 5)$.

(iii) Write α + 7β = 1 and 3α + 5β = 1. Solving, we have α = 1/8 and β = 1/8. Thus, $(1, 1) = \frac{1}{8}(1, 3) + \frac{1}{8}(7, 5)$.

Answer to Exercise 112. (a) We have $p = \frac{6}{11}a + \frac{5}{11}b = \frac{6}{11}(1, 2) + \frac{5}{11}(3, 4) = \left(\frac{21}{11}, \frac{32}{11}\right)$. Hence, the point is $p = \left(\frac{21}{11}, \frac{32}{11}\right)$.

(b) We have $p = \frac{1}{6}a + \frac{5}{6}b = \frac{1}{6}(1, 4) + \frac{5}{6}(2, 3) = \left(\frac{11}{6}, \frac{19}{6}\right)$. Hence, the point is $p = \left(\frac{11}{6}, \frac{19}{6}\right)$.

(c) We have $p = \frac{2}{5}a + \frac{3}{5}b = \frac{2}{5}(-1, 2) + \frac{3}{5}(3, -4) = \left(\frac{7}{5}, -\frac{8}{5}\right)$. Hence, the point is $p = \left(\frac{7}{5}, -\frac{8}{5}\right)$.


Answer to Exercise 113 (a). (2, 0) is a vector that points purely to the right and (0, 17) is a vector that points purely up. So the angle between them must be π/2. Now verify that the formula gives us this angle:

$$\theta = \cos^{-1}\left(\frac{(2, 0) \cdot (0, 17)}{|(2, 0)|\,|(0, 17)|}\right) = \cos^{-1}\left(\frac{2 \times 0 + 0 \times 17}{|(2, 0)|\,|(0, 17)|}\right) = \cos^{-1} 0 = \frac{\pi}{2}. ✓$$

(b) (5, 0) is a vector that points purely to the right and (−3, 0) is a vector that points purely to the left. So the angle between them must be π. Now verify:

$$\theta = \cos^{-1}\left(\frac{(5, 0) \cdot (-3, 0)}{|(5, 0)|\,|(-3, 0)|}\right) = \cos^{-1}\left(\frac{5 \times (-3) + 0 \times 0}{\sqrt{5^2 + 0^2}\,\sqrt{(-3)^2 + 0^2}}\right) = \cos^{-1}\left(\frac{-15}{5 \times 3}\right) = \cos^{-1}(-1) = \pi. ✓$$

Answer to Exercise 113 (c). Recall that the right-angled triangle whose base is 1 and side is $\frac{\sqrt{3}}{3}$ has angle π/6 between the base and the hypotenuse. Hence, π/6 is the angle between i and $\left(1, \frac{\sqrt{3}}{3}\right)$. Now verify:

$$\theta = \cos^{-1}\left(\frac{(1, 0) \cdot (1, \sqrt{3}/3)}{|(1, 0)|\,|(1, \sqrt{3}/3)|}\right) = \cos^{-1}\left(\frac{1 \times 1 + 0 \times \sqrt{3}/3}{\sqrt{1^2 + 0^2}\,\sqrt{1^2 + (\sqrt{3}/3)^2}}\right) = \cos^{-1}\left(\frac{1}{1 \times \sqrt{4/3}}\right) = \cos^{-1}\left(\frac{\sqrt{3}}{2}\right) = \frac{\pi}{6}. ✓$$

(d) Recall that the right-angled triangle whose base is 1 and side is $\sqrt{3}$ has angle π/3 between the base and the hypotenuse. Hence, π/3 is the angle between i and $(1, \sqrt{3})$. Now verify:

$$\theta = \cos^{-1}\left(\frac{(1, 0) \cdot (1, \sqrt{3})}{|(1, 0)|\,|(1, \sqrt{3})|}\right) = \cos^{-1}\left(\frac{1 \times 1 + 0 \times \sqrt{3}}{\sqrt{1^2 + 0^2}\,\sqrt{1^2 + (\sqrt{3})^2}}\right) = \cos^{-1}\left(\frac{1}{1 \times \sqrt{4}}\right) = \cos^{-1}\left(\frac{1}{2}\right) = \frac{\pi}{3}. ✓$$

[Figures: two sets of x-y axes, each showing the vector i and the angle in radians.]
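The four angle checks in Exercise 113 all apply the same formula, θ = cos⁻¹[u ⋅ v / (|u| |v|)]. A minimal sketch in Python follows; the function name `angle_between` is hypothetical, and the clamp is an added safeguard against floating-point rounding, not part of the formula.

```python
import math

def angle_between(u, v):
    """Angle in radians between two nonzero 2D vectors u and v."""
    dot = sum(x * y for x, y in zip(u, v))
    cosine = dot / (math.hypot(*u) * math.hypot(*v))
    # Rounding can push the ratio slightly outside [-1, 1], where acos is undefined.
    return math.acos(max(-1.0, min(1.0, cosine)))

right_angle = angle_between((2, 0), (0, 17))              # expected: pi/2
straight = angle_between((5, 0), (-3, 0))                 # expected: pi
thirty = angle_between((1, 0), (1, math.sqrt(3) / 3))     # expected: pi/6
sixty = angle_between((1, 0), (1, math.sqrt(3)))          # expected: pi/3
print(right_angle, straight, thirty, sixty)
```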

Answer to Exercise 114. i ⋅ j = (1, 0) ⋅ (0, 1) = 1 × 0 + 0 × 1 = 0 + 0 = 0. Hence, i and j are orthogonal.

Answer to Exercise 115. (a) The length of the projection of (1, 0) on (33, 33) is the same as the length of the projection of (1, 0) on (1, 1), which is:

$$(1, 0) \cdot \widehat{(1, 1)} = (1, 0) \cdot \left[\frac{1}{|(1, 1)|}(1, 1)\right] = \frac{1}{|(1, 1)|}(1, 0) \cdot (1, 1) = \frac{1}{\sqrt{2}}(1 \times 1 + 0 \times 1) = \frac{1}{\sqrt{2}} = \frac{\sqrt{2}}{2}.$$

(b) The length of the projection of (33, 33) on (1, 0) is

$$(33, 33) \cdot \widehat{(1, 0)} = (33, 33) \cdot \left[\frac{1}{|(1, 0)|}(1, 0)\right] = \frac{1}{|(1, 0)|}(33, 33) \cdot (1, 0) = \frac{1}{1}(33 \times 1 + 33 \times 0) = 33.$$

Answer to Exercise 116. (a) Given the vector (1, 3), its x-direction cosine is $\frac{(1, 3) \cdot (1, 0)}{|(1, 3)|\,|(1, 0)|} = \frac{1}{\sqrt{10}}$ and its y-direction cosine is $\frac{(1, 3) \cdot (0, 1)}{|(1, 3)|\,|(0, 1)|} = \frac{3}{\sqrt{10}}$. Hence, its unit vector is $\left(\frac{1}{\sqrt{10}}, \frac{3}{\sqrt{10}}\right)$.

(b) Given the vector (4, 2), its x-direction cosine is $\frac{(4, 2) \cdot (1, 0)}{|(4, 2)|\,|(1, 0)|} = \frac{4}{\sqrt{20}}$ and its y-direction cosine is $\frac{(4, 2) \cdot (0, 1)}{|(4, 2)|\,|(0, 1)|} = \frac{2}{\sqrt{20}}$. Hence, its unit vector is $\left(\frac{4}{\sqrt{20}}, \frac{2}{\sqrt{20}}\right)$.

(c) Given the vector (−1, 2), its x-direction cosine is $\frac{(-1, 2) \cdot (1, 0)}{|(-1, 2)|\,|(1, 0)|} = -\frac{1}{\sqrt{5}}$ and its y-direction cosine is $\frac{(-1, 2) \cdot (0, 1)}{|(-1, 2)|\,|(0, 1)|} = \frac{2}{\sqrt{5}}$. Hence, its unit vector is $\left(-\frac{1}{\sqrt{5}}, \frac{2}{\sqrt{5}}\right)$.

88.3 Answers for Ch. 29: Vectors in 3D

Answer to Exercise 117. (a) A three-dimensional (3D) vector is an “arrow” that has two characteristics: direction and length. Just like a point, it can be described by an ordered triple of real numbers. The vector $\mathbf{a} = (a_1, a_2, a_3)$ carries us from the origin to the point $(a_1, a_2, a_3)$.

(b) $\mathbf{a} = (a_1, a_2, a_3) = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k} = \vec{a}$. If we let a refer to the point $(a_1, a_2, a_3)$, then we can also write $\mathbf{a}$ as the position vector $\overrightarrow{oa}$ (i.e. the vector that carries us from the origin to the point a).

(c) Given two points $a = (a_1, a_2, a_3)$ and $b = (b_1, b_2, b_3)$, (i) there is no such thing as a + b; (ii) $a + \overrightarrow{ob}$ is the point $(a_1 + b_1, a_2 + b_2, a_3 + b_3)$; (iii) $\overrightarrow{oa} + \overrightarrow{ob}$ is the vector $(a_1 + b_1, a_2 + b_2, a_3 + b_3)$; and (iv) $\overrightarrow{oa} - \overrightarrow{ob}$ is the vector $\overrightarrow{ba}$.

Answer to Exercise 118. (a) The lengths of the vectors a = (1, 2, 3), b = (4, 5, 6), and a − b are $|a| = \sqrt{1^2 + 2^2 + 3^2} = \sqrt{14}$, $|b| = \sqrt{4^2 + 5^2 + 6^2} = \sqrt{77}$, and $|a - b| = |(-3, -3, -3)| = \sqrt{(-3)^2 + (-3)^2 + (-3)^2} = \sqrt{27}$.

(b) The lengths of the vectors 2a = (2, 4, 6), 3b = (12, 15, 18), and 4(a − b) are respectively $2\sqrt{14}$, $3\sqrt{77}$, and $4\sqrt{27}$.

(c) The unit vectors are $\hat{a} = \frac{1}{\sqrt{14}}(1, 2, 3)$, $\hat{b} = \frac{1}{\sqrt{77}}(4, 5, 6)$, and $\widehat{a - b} = \frac{1}{\sqrt{27}}(-3, -3, -3)$.

(d) (1, 2, 3) ⋅ (4, 5, 6) = 1 × 4 + 2 × 5 + 3 × 6 = 32 and (−2, 4, −6) ⋅ (1, −2, 3) = (−2) × 1 + 4 × (−2) + (−6) × 3 = −28.

(e) (i) The angle between the vectors (1, 2, 3) and (4, 5, 6) is

$$\cos^{-1}\left[\frac{a \cdot b}{|a|\,|b|}\right] = \cos^{-1}\left[\frac{32}{\sqrt{14} \times \sqrt{77}}\right] = \cos^{-1}\frac{32}{\sqrt{1078}} \approx 0.226.$$

(e) (ii) The angle between the vectors (−2, 4, −6) and (1, −2, 3) is

$$\cos^{-1}\left[\frac{u \cdot v}{|u|\,|v|}\right] = \cos^{-1}\left[\frac{-28}{\sqrt{(-2)^2 + 4^2 + (-6)^2} \times \sqrt{1^2 + (-2)^2 + 3^2}}\right] = \cos^{-1}\frac{-28}{\sqrt{56 \times 14}} = \cos^{-1}\frac{-28}{28} = \pi.$$

No, these two vectors are not orthogonal; instead, they point in exactly opposite directions.

(f) The length of the projection of a = (1, 2, 3) on b = (4, 5, 6) is $a \cdot \hat{b} = (1, 2, 3) \cdot \frac{1}{\sqrt{77}}(4, 5, 6) = \frac{32}{\sqrt{77}}$.

(g) By the Ratio Theorem, the vector p that divides the line segment ab in the ratio 2 ∶ 3 is given by

$$p = \frac{\mu}{\lambda + \mu}a + \frac{\lambda}{\lambda + \mu}b = \frac{3}{2 + 3}(1, 2, 3) + \frac{2}{2 + 3}(4, 5, 6) = \frac{1}{5}(11, 16, 21).$$

Hence, the point is $p = \frac{1}{5}(11, 16, 21)$.

(h) (i) Given the vector (1, 3, −2), its x-, y-, and z-direction cosines are, respectively

$$\frac{(1, 3, -2) \cdot (1, 0, 0)}{|(1, 3, -2)|\,|(1, 0, 0)|} = \frac{1}{\sqrt{14}}, \quad \frac{(1, 3, -2) \cdot (0, 1, 0)}{|(1, 3, -2)|\,|(0, 1, 0)|} = \frac{3}{\sqrt{14}}, \quad \text{and} \quad \frac{(1, 3, -2) \cdot (0, 0, 1)}{|(1, 3, -2)|\,|(0, 0, 1)|} = \frac{-2}{\sqrt{14}}.$$

Hence, its unit vector is $\left(\frac{1}{\sqrt{14}}, \frac{3}{\sqrt{14}}, \frac{-2}{\sqrt{14}}\right)$.

(ii) Given the vector (4, 2, −3), its x-, y-, and z-direction cosines are, respectively

$$\frac{(4, 2, -3) \cdot (1, 0, 0)}{|(4, 2, -3)|\,|(1, 0, 0)|} = \frac{4}{\sqrt{29}}, \quad \frac{(4, 2, -3) \cdot (0, 1, 0)}{|(4, 2, -3)|\,|(0, 1, 0)|} = \frac{2}{\sqrt{29}}, \quad \text{and} \quad \frac{(4, 2, -3) \cdot (0, 0, 1)}{|(4, 2, -3)|\,|(0, 0, 1)|} = \frac{-3}{\sqrt{29}}.$$

Hence, its unit vector is $\left(\frac{4}{\sqrt{29}}, \frac{2}{\sqrt{29}}, \frac{-3}{\sqrt{29}}\right)$.

(iii) Given the vector (−1, 2, −4), its x-, y-, and z-direction cosines are, respectively

$$\frac{(-1, 2, -4) \cdot (1, 0, 0)}{|(-1, 2, -4)|\,|(1, 0, 0)|} = \frac{-1}{\sqrt{21}}, \quad \frac{(-1, 2, -4) \cdot (0, 1, 0)}{|(-1, 2, -4)|\,|(0, 1, 0)|} = \frac{2}{\sqrt{21}}, \quad \text{and} \quad \frac{(-1, 2, -4) \cdot (0, 0, 1)}{|(-1, 2, -4)|\,|(0, 0, 1)|} = \frac{-4}{\sqrt{21}}.$$

Hence, its unit vector is $\left(\frac{-1}{\sqrt{21}}, \frac{2}{\sqrt{21}}, \frac{-4}{\sqrt{21}}\right)$.
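Several parts of the answer to Exercise 118 (projection length, the Ratio Theorem, direction cosines) reduce to a dot product and a norm. The sketch below collects them; all function names are hypothetical, and `ratio_point` follows the convention used in part (g), with weight μ on a and λ on b.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def projection_length(a, b):
    """Length of the projection of a on b, computed as a . b-hat."""
    return dot(a, b) / norm(b)

def ratio_point(a, b, lam, mu):
    """Point dividing the segment ab in the ratio lam : mu (Ratio Theorem)."""
    return tuple((mu * x + lam * y) / (lam + mu) for x, y in zip(a, b))

def direction_cosines(u):
    """Components of the unit vector in the direction of u."""
    return tuple(x / norm(u) for x in u)

a, b = (1, 2, 3), (4, 5, 6)
print(projection_length(a, b))        # 32/sqrt(77), about 3.647
print(ratio_point(a, b, 2, 3))        # (2.2, 3.2, 4.2), i.e. (11, 16, 21)/5
print(direction_cosines((1, 3, -2)))  # each component of (1, 3, -2) divided by sqrt(14)
```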

88.4 Answers for Ch. 30: Vector Product

Answer to Exercise 119. (a) If u = (0, 1, 2) and v = (3, 4, 5), then u × v = (−3, 6, −3). Let’s verify that u × v is orthogonal to u, by computing (u × v) ⋅ u = (−3, 6, −3) ⋅ (0, 1, 2) = 0 + 6 − 6 = 0 ✓. Similarly, let’s verify that u × v is orthogonal to v, by computing (u × v) ⋅ v = (−3, 6, −3) ⋅ (3, 4, 5) = −9 + 24 − 15 = 0 ✓.

(b) If u = (−1, −2, −3) and v = (1, 0, 5), then u × v = (−10, 2, 2). Let’s verify that u × v is orthogonal to u, by computing (u × v) ⋅ u = (−10, 2, 2) ⋅ (−1, −2, −3) = 10 − 4 − 6 = 0 ✓. Similarly, let’s verify that u × v is orthogonal to v, by computing (u × v) ⋅ v = (−10, 2, 2) ⋅ (1, 0, 5) = −10 + 0 + 10 = 0 ✓.

Answer to Exercise 120.

$$(u \times v) \cdot u = \begin{pmatrix} u_y v_z - u_z v_y \\ u_z v_x - u_x v_z \\ u_x v_y - u_y v_x \end{pmatrix} \cdot \begin{pmatrix} u_x \\ u_y \\ u_z \end{pmatrix}$$
$$= (u_y v_z - u_z v_y)u_x + (u_z v_x - u_x v_z)u_y + (u_x v_y - u_y v_x)u_z$$
$$= u_x u_y v_z - u_x v_y u_z + v_x u_y u_z - u_x u_y v_z + u_x v_y u_z - v_x u_y u_z = 0.$$

$$(u \times v) \cdot v = \begin{pmatrix} u_y v_z - u_z v_y \\ u_z v_x - u_x v_z \\ u_x v_y - u_y v_x \end{pmatrix} \cdot \begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix}$$
$$= (u_y v_z - u_z v_y)v_x + (u_z v_x - u_x v_z)v_y + (u_x v_y - u_y v_x)v_z$$
$$= v_x u_y v_z - v_x v_y u_z + v_x v_y u_z - u_x v_y v_z + u_x v_y v_z - v_x u_y v_z = 0.$$

Answer to Exercise 121. (a) Given u = (1, 2, 3) and v = (4, 5, 6), u × v = (−3, 6, −3) and v × u = (3, −6, 3) and hence u × v = −v × u.

(b) By definition,

$$u \times v = \begin{pmatrix} u_y v_z - u_z v_y \\ u_z v_x - u_x v_z \\ u_x v_y - u_y v_x \end{pmatrix} \quad \text{and} \quad v \times u = \begin{pmatrix} v_y u_z - v_z u_y \\ v_z u_x - v_x u_z \\ v_x u_y - v_y u_x \end{pmatrix},$$

and thus in general, the 3D vector product is anti-commutative: u × v = −v × u.
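The component formula for the vector product in Exercises 119 to 121 translates line by line into code, which makes the orthogonality and anti-commutativity checks mechanical. A short sketch with hypothetical helper names:

```python
def cross(u, v):
    """3D vector product, using the component formula from Exercise 120."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

u, v = (0, 1, 2), (3, 4, 5)
w = cross(u, v)
print(w, dot(w, u), dot(w, v))  # (-3, 6, -3) 0 0

p, q = (-1, -2, -3), (1, 0, 5)
print(cross(p, q))              # (-10, 2, 2)

# Anti-commutativity: v x u is the negative of u x v.
print(cross((1, 2, 3), (4, 5, 6)), cross((4, 5, 6), (1, 2, 3)))  # (-3, 6, -3) (3, -6, 3)
```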

88.5 Answers for Ch. 31: Lines

Answer to Exercise 122. (a) The line 5x − y − 1 = 0 can also be written as r = (0, −1) + λ(1, 5) (λ ∈ R). (b) The line x − 2y − 1 = 0 can also be written as r = (1, 0) + λ(2, 1) (λ ∈ R).

(c) The line y − 4 = 0 can also be written as r = (0, 4) + λ(1, 0) (λ ∈ R).

(d) The line x − 4 = 0 can also be written as r = (4, 0) + λ(0, 1) (λ ∈ R).

Answer to Exercise 123. (a) The line r = (−1, 3) + λ(1, −2) (λ ∈ R) has cartesian equations

x = −1 + λ, y = 3 − 2λ.

Eliminating λ, we have y = −2x + 1 or $\frac{y - 1}{-2} = \frac{x}{1}$.

(b) The line r = (5, 6) + λ(7, 8) (λ ∈ R) has cartesian equations

x = 5 + 7λ, y = 6 + 8λ.

Eliminating λ, we have y = 8x/7 + 2/7 or $\frac{y - 2/7}{1} = \frac{x}{7/8}$.

(c) The line r = (0, −3) + λ(3, 0) (λ ∈ R) has cartesian equations

x = 3λ, y = −3.

Eliminating λ, we have y = −3.

Answer to Exercise 124. (a) The line r = (−1, 1, 1) + λ(3, −2, 1) (λ ∈ R) has cartesian equations

$$\frac{x + 1}{3} = \frac{y - 1}{-2} = \frac{z - 1}{1}.$$

That is, this is the line that contains the points (x, y, z) which satisfy the above two equations.

(b) The line r = (5, 6, 1) + λ(7, 8, 1) (λ ∈ R) has cartesian equations

$$\frac{x - 5}{7} = \frac{y - 6}{8} = \frac{z - 1}{1}.$$

That is, this is the line that contains the points (x, y, z) which satisfy the above two equations.

(c) The line r = (0, −3, 1) + λ(3, 0, 1) (λ ∈ R) has cartesian equations

$$y = -3, \quad \frac{x}{3} = \frac{z - 1}{1}.$$

That is, this is the line that contains the points (x, y, z) which satisfy the above two equations.

(d) The line r = (9, 9, 9) + λ(1, 0, 0) (λ ∈ R) has cartesian equations

y = 9, z = 9.

That is, this is the line that contains the points (x, y, z) which satisfy the above two equations. These are the points (λ, 9, 9), where λ can be any real.

Answer to Exercise 125. (a) We can transform the equations $\frac{7x - 2}{5} = \frac{0.3y - 5}{7} = \frac{8z}{7}$ into $\frac{x - 2/7}{5/7} = \frac{y - 50/3}{70/3} = \frac{z}{7/8}$. And so this is also the line r = (2/7, 50/3, 0) + λ(5/7, 70/3, 7/8) (λ ∈ R).

(b) We can transform the equations 2x = 3y = 5z into $\frac{x}{1/2} = \frac{y}{1/3} = \frac{z}{1/5}$. And so this is also the line r = (0, 0, 0) + λ(1/2, 1/3, 1/5) (λ ∈ R).

(c) We can transform the equations $17x - 4 = \frac{3y - 1}{2} = 3z$ into $\frac{x - 4/17}{1/17} = \frac{y - 1/3}{2/3} = \frac{z}{1/3}$. And so this is also the line r = (4/17, 1/3, 0) + λ(1/17, 2/3, 1/3) (λ ∈ R).

(d) We can transform the equations $\frac{x - 3}{2} = \frac{5z - 2}{7}$, 3y = 11 into $\frac{x - 3}{2} = \frac{z - 2/5}{7/5}$, y = 11/3. And so this is also the line r = (3, 11/3, 2/5) + λ(2, 0, 7/5) (λ ∈ R).

Answer to Exercise 126. (a) Given the points a = (3, 1, 2), b = (1, 6, 5), and c = (0, −1, 0), first take the line through a and b. The vector from a to b is (−2, 5, 3) and the line passes through a. Hence, the line can be written as r = (3, 1, 2) + λ(−2, 5, 3) (λ ∈ R). Then check whether c is on the line: Is there λ such that c = (0, −1, 0) = (3, 1, 2) + λ(−2, 5, 3)? Rearranging, we have (−3, −2, −2) = λ(−2, 5, 3), which we can write out as:

−3 = −2λ, −2 = 5λ, −2 = 3λ.

Clearly, there is no λ such that the above three equations can be true. And so the point c is not on the line through a and b. Hence, the three points are not collinear.

(b) Given the points a = (1, 2, 4), b = (0, 0, 1), and c = (3, 6, 10), first take the line through a and b. The vector from a to b is (−1, −2, −3) and the line passes through a. Hence, the line can be written as r = (1, 2, 4) + λ(−1, −2, −3) (λ ∈ R). Then check whether c is on the line: Is there λ such that c = (3, 6, 10) = (1, 2, 4) + λ(−1, −2, −3)? Rearranging, we have (2, 4, 6) = λ(−1, −2, −3), which we can write out as:

2 = −λ, 4 = −2λ, 6 = −3λ.

Clearly, all three of the above equations are true if λ = −2. And so c is also on the line. Hence, the three points are collinear.
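The answer to Exercise 126 tests collinearity by looking for a single λ that satisfies all three component equations. An equivalent test, sketched below, is a different technique from the one used in the answer: three points are collinear exactly when the vector product of ab and ac is the zero vector. The helper names are hypothetical.

```python
def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def are_collinear(a, b, c):
    """True if the vectors ab and ac are parallel (their vector product is zero)."""
    return cross(sub(b, a), sub(c, a)) == (0, 0, 0)

print(are_collinear((3, 1, 2), (1, 6, 5), (0, -1, 0)))  # False
print(are_collinear((1, 2, 4), (0, 0, 1), (3, 6, 10)))  # True
```

With integer coordinates the comparison with (0, 0, 0) is exact; with floating-point coordinates a small tolerance would be needed instead.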

88.6 Answers for Ch. 32: Planes

Answer to Exercise 127. (a) The plane containing a = (7, 3, 4), b = (8, 3, 4), and c = (9, 3, 7) contains also the vectors $\overrightarrow{ab} = (1, 0, 0)$ and $\overrightarrow{ac} = (2, 0, 3)$. A normal vector is thus (1, 0, 0) × (2, 0, 3) = (0, −3, 0). Any scalar multiple of a normal vector is itself a normal vector, so we may as well pick (0, 1, 0) as our normal vector. (Whenever we have the choice, we prefer vectors that involve as many 1’s and as few negative signs as possible, because this will simplify our calculations.)

The plane can thus be described by the vector equation r ⋅ (0, 1, 0) = 3 or the cartesian equation y = 3.

(b) The plane containing a = (8, 0, 2), b = (4, 4, 3), and c = (2, 7, 2) contains also the vectors $\overrightarrow{ab} = (-4, 4, 1)$ and $\overrightarrow{ac} = (-6, 7, 0)$. A normal vector is thus (−4, 4, 1) × (−6, 7, 0) = (−7, −6, −4). Another normal vector is (7, 6, 4).

The plane can thus be described by the vector equation r ⋅ (7, 6, 4) = 64 or the cartesian equation 7x + 6y + 4z = 64.

(c) The plane containing a = (8, 5, 9), b = (8, 4, 5), and c = (5, 6, 0) contains also the vectors $\overrightarrow{ab} = (0, -1, -4)$ and $\overrightarrow{ac} = (-3, 1, -9)$. A normal vector is thus (0, −1, −4) × (−3, 1, −9) = (13, 12, −3).

The plane can thus be described by the vector equation r ⋅ (13, 12, −3) = 137 or the cartesian equation 13x + 12y − 3z = 137.
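The method in the answer to Exercise 127 (take the vector product of ab and ac as a normal n, then set d = a ⋅ n) can be sketched as follows. The function name `plane_through` is hypothetical, and the returned normal is not rescaled, so it may be a scalar multiple of the tidier normal chosen by hand.

```python
def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def plane_through(a, b, c):
    """Return (n, d) such that the plane through a, b, c is r . n = d."""
    n = cross(sub(b, a), sub(c, a))
    if n == (0, 0, 0):
        raise ValueError("the three points are collinear, so no unique plane exists")
    return n, dot(a, n)

print(plane_through((7, 3, 4), (8, 3, 4), (9, 3, 7)))  # ((0, -3, 0), -9), i.e. y = 3
print(plane_through((8, 0, 2), (4, 4, 3), (2, 7, 2)))  # ((-7, -6, -4), -64), i.e. 7x + 6y + 4z = 64
```

A useful follow-up check is that all three points satisfy r ⋅ n = d, since a single arithmetic slip in the vector product usually breaks this for at least one of them.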

Answer to Exercise 128. (a) Given a plane with cartesian equation 3x + 2y + 5z = −3, we immediately know that it has vector equation r ⋅ (3, 2, 5) = −3.

(b) Given a plane with cartesian equation 2y + 5z = −3, we immediately know that it has vector equation r ⋅ (0, 2, 5) = −3.

(c) Given a plane with cartesian equation 5z = −3, we immediately know that it has vector equation r ⋅ (0, 0, 5) = −3.

Answer to Exercise 129. (a) r ⋅ (3, 6, 2) = 4 can be rewritten as r ⋅ (3/7, 6/7, 2/7) = 4/7.

(b) r ⋅ (1, 2, 2) = −1 can be rewritten as r ⋅ (−1/3, −2/3, −2/3) = 1/3.

(c) r ⋅ (8, 1, 4) = 0 can be rewritten as r ⋅ (8/9, 1/9, 4/9) = 0.

88.7 Answers for Ch. 33: Distances

Answer to Exercise 130 (a). Given the point a = (7, 3, 4) and the line l described by r = (8, 3, 4) + λ(9, 3, 7), we have: $\overrightarrow{pa} = (-1, 0, 0)$ and so $|\overrightarrow{pa}|^2 = 1^2 = 1$. Also, $\overrightarrow{pa} \cdot \hat{v} = \frac{(-1, 0, 0) \cdot (9, 3, 7)}{\sqrt{9^2 + 3^2 + 7^2}} = -\frac{9}{\sqrt{139}}$ and so $(\overrightarrow{pa} \cdot \hat{v})^2 = \frac{81}{139}$. Hence, the length of the third side is $\sqrt{1 - 81/139} = \sqrt{58/139} \approx 0.646$. This is the distance between point a and the line l. And

$$b = (8, 3, 4) - \frac{9}{\sqrt{139}} \cdot \frac{(9, 3, 7)}{\sqrt{139}} = (8, 3, 4) - \frac{(81, 27, 63)}{139} = \frac{(1031, 390, 493)}{139}.$$

[Figure, not to scale: the point a = (7, 3, 4), the line l through p = (8, 3, 4), and the foot of the perpendicular b = (7 58/139, 2 112/139, 3 76/139). The distance between a and b is 0.646.]

Answer to Exercise 130 (b). Given the point a = (8, 0, 2) and the line l described by → = (4, −4, −1) and so ∣Ð → 2 = 42 + 42 + 12 = 33. Also, r = (4, 4, 3) + λ(2, 7, 2), we have: Ð pa pa∣ (4, −4, −1) ⋅ (2, 7, 2) 484 22 2 Ð →⋅v →⋅v ˆ= ˆ) = √ pa pa = − √ and so (Ð . Hence, the length of the side is 2 + 72 + 22 57 57 2 √ √ 1397 484 33 − = ≈ 4.951. This is the distance between point a and the line l. And 57 57 22 (2, 7, 2) √ b = (4, 4, 3) − √ 57 57 22 (2, 7, 2) = (4, 4, 3) − 57 (184, 74, 127) . = 57

Not to scale.

a = (8, 0, 2) Distance between a and b is 3.170

l b = (1

,

,2

)

p = (4, 4, 3)

Answer to Exercise 130 (c). Given the point a = (8, 5, 9) and the line l described by r = (8, 4, 5) + λ(5, 6, 0), we have $\vec{pa} = (0, 1, 4)$ and so $|\vec{pa}|^2 = 0^2 + 1^2 + 4^2 = 17$. Also,

$$\vec{pa} \cdot \hat{v} = \frac{(0, 1, 4) \cdot (5, 6, 0)}{\sqrt{5^2 + 6^2 + 0^2}} = \frac{6}{\sqrt{61}} \quad \text{and so} \quad (\vec{pa} \cdot \hat{v})^2 = \frac{36}{61}.$$

Hence, the length of the remaining side is $\sqrt{17 - 36/61} = \sqrt{1001/61} \approx 4.051$. This is the distance between the point a and the line l. And

$$b = (8, 4, 5) + \frac{6}{\sqrt{61}} \cdot \frac{(5, 6, 0)}{\sqrt{61}} = (8, 4, 5) + \frac{6}{61}(5, 6, 0) = \frac{1}{61}(518, 280, 305).$$

[Figure, not to scale: the line l through p = (8, 4, 5), the point a = (8, 5, 9), and the foot of the perpendicular b. The distance between a and b is 4.051.]
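The projection method of Exercise 130 can be checked with a short Python sketch. The function name `foot_and_distance` and the use of exact fractions are assumptions of this illustration, not part of the textbook.

```python
from fractions import Fraction
from math import sqrt

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))

def foot_and_distance(a, p, v):
    """Foot of the perpendicular from point a to the line r = p + t*v,
    together with the distance from a to the line."""
    pa = [Fraction(ai - pi) for ai, pi in zip(a, p)]
    t = dot(pa, v) / Fraction(dot(v, v))        # (pa . v) / |v|^2
    b = [pi + t * vi for pi, vi in zip(p, v)]   # b = p + t*v
    dist_sq = dot(pa, pa) - t * t * dot(v, v)   # |pa|^2 - (pa . v_hat)^2
    return b, sqrt(dist_sq)

b, d = foot_and_distance((7, 3, 4), (8, 3, 4), (9, 3, 7))
print(b, round(d, 3))
```

Working with `Fraction` keeps the coordinates of b exact, so they can be compared directly with the fractions in the answers.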

Answer to Exercise 131 (a). Given the point a = (7, 3, 4) and the line l described by r = (8, 3, 4) + λ(9, 3, 7), we have $\vec{ra} = (-1 - 9\lambda, -3\lambda, -7\lambda)$ and so $|\vec{ra}| = \sqrt{139\lambda^2 + 18\lambda + 1}$, which is minimised at 278λ + 18 = 0 or λ = −18/278 = −9/139.

So $b = (8, 3, 4) - \frac{9}{139}(9, 3, 7) = \left(7\tfrac{58}{139}, 2\tfrac{112}{139}, 3\tfrac{76}{139}\right)$. Moreover, the length of the line segment ab is

$$\sqrt{139\lambda^2 + 18\lambda + 1} = \sqrt{139\left(-\frac{9}{139}\right)^2 + 18\left(-\frac{9}{139}\right) + 1} = \sqrt{\frac{58}{139}}.$$

Answer to Exercise 131 (b). Given the point a = (8, 0, 2) and the line l described by r = (4, 4, 3) + λ(2, 7, 2), we have $\vec{ra} = (4 - 2\lambda, -4 - 7\lambda, -1 - 2\lambda)$ and so $|\vec{ra}| = \sqrt{57\lambda^2 + 44\lambda + 33}$, which is minimised at 114λ + 44 = 0 or λ = −44/114 = −22/57.

So $b = (4, 4, 3) - \frac{22}{57}(2, 7, 2) = \frac{1}{57}(184, 74, 127)$. Moreover, the length of the line segment ab is

$$\sqrt{57\lambda^2 + 44\lambda + 33} = \sqrt{57\left(-\frac{22}{57}\right)^2 + 44\left(-\frac{22}{57}\right) + 33} = \sqrt{\frac{1397}{57}}.$$

Answer to Exercise 131 (c). Given the point a = (8, 5, 9) and the line l described by r = (8, 4, 5) + λ(5, 6, 0), we have $\vec{ra} = (-5\lambda, 1 - 6\lambda, 4)$ and so $|\vec{ra}| = \sqrt{61\lambda^2 - 12\lambda + 17}$, which is minimised at 122λ − 12 = 0 or λ = 12/122 = 6/61.

So $b = (8, 4, 5) + \frac{6}{61}(5, 6, 0) = \frac{1}{61}(518, 280, 305)$. Moreover, the length of the line segment ab is

$$\sqrt{61\lambda^2 - 12\lambda + 17} = \sqrt{61\left(\frac{6}{61}\right)^2 - 12\left(\frac{6}{61}\right) + 17} = \sqrt{\frac{1001}{61}}.$$
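The calculus method of Exercise 131 amounts to finding the vertex of the quadratic |ra|² = Aλ² + Bλ + C. A minimal Python sketch follows, where the name `closest_parameter` and the coefficient names A, B, C are assumptions made for illustration only.

```python
from fractions import Fraction

def closest_parameter(a, p, v):
    """For the line r = p + lam*v and the point a, return (A, B, C, lam) where
    |ra|^2 = A*lam^2 + B*lam + C and lam = -B/(2A) minimises it."""
    pa = [ai - pi for ai, pi in zip(a, p)]
    A = sum(vi * vi for vi in v)
    B = -2 * sum(x * vi for x, vi in zip(pa, v))
    C = sum(x * x for x in pa)
    return A, B, C, Fraction(-B, 2 * A)

A, B, C, lam = closest_parameter((8, 5, 9), (8, 4, 5), (5, 6, 0))
min_sq = A * lam**2 + B * lam + C   # squared length of the segment ab
print(A, B, C, lam, min_sq)
```

The squared distance is returned exactly; taking its square root gives the length of ab.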

Answer to Exercise 132 (a). Consider the point a = (7, 3, 4) and the plane r ⋅ (9, 3, 7) = 109. Convert the vector equation of the plane to Hessian normal form:

$$r \cdot \left(\frac{9}{\sqrt{139}}, \frac{3}{\sqrt{139}}, \frac{7}{\sqrt{139}}\right) = \frac{109}{\sqrt{139}}.$$

So $\hat{n} = \frac{1}{\sqrt{139}}(9, 3, 7)$, $\hat{d} = \frac{109}{\sqrt{139}}$, and $a \cdot \hat{n} = \frac{100}{\sqrt{139}}$.

Altogether then, the distance between the point and the plane is $|\hat{d} - a \cdot \hat{n}| = \frac{9}{\sqrt{139}}$ and the foot of the perpendicular is

$$a + (\hat{d} - a \cdot \hat{n})\hat{n} = (7, 3, 4) + \frac{9}{\sqrt{139}} \cdot \frac{1}{\sqrt{139}}(9, 3, 7) = \frac{1}{139}(1054, 444, 619).$$

By the way, notice that in this example, n points in the same direction as $\vec{ab}$. And so $\widehat{ab} = \hat{n}$. And moreover, $\hat{d} - a \cdot \hat{n} > 0$.

[Figure, not to scale: the plane through p = (8, 3, 4), the point a = (7, 3, 4), and the foot of the perpendicular b.]

Answer to Exercise 132 (b). Consider the point a = (8, 0, 2) and the plane r ⋅ (2, 7, 2) = 42. Convert the vector equation of the plane to Hessian normal form:

$$r \cdot \left(\frac{2}{\sqrt{57}}, \frac{7}{\sqrt{57}}, \frac{2}{\sqrt{57}}\right) = \frac{42}{\sqrt{57}}.$$

So $\hat{n} = \frac{1}{\sqrt{57}}(2, 7, 2)$, $\hat{d} = \frac{42}{\sqrt{57}}$, and $a \cdot \hat{n} = \frac{20}{\sqrt{57}}$.

Altogether then, the distance between the point and the plane is $|\hat{d} - a \cdot \hat{n}| = \frac{22}{\sqrt{57}}$ and the foot of the perpendicular is

$$a + (\hat{d} - a \cdot \hat{n})\hat{n} = (8, 0, 2) + \frac{22}{\sqrt{57}} \cdot \frac{(2, 7, 2)}{\sqrt{57}} = \frac{1}{57}(500, 154, 158).$$

By the way, notice that in this example, n points in the same direction as $\vec{ab}$. And so $\widehat{ab} = \hat{n}$. And moreover, $\hat{d} - a \cdot \hat{n} > 0$.

[Figure, not to scale: the plane through p = (4, 4, 3), the point a = (8, 0, 2), and the foot of the perpendicular b.]

Answer to Exercise 132 (c). Consider the point a = (8, 5, 9) and the plane r ⋅ (5, 6, 0) = 64. Convert the vector equation of the plane to Hessian normal form:

$$r \cdot \left(\frac{5}{\sqrt{61}}, \frac{6}{\sqrt{61}}, \frac{0}{\sqrt{61}}\right) = \frac{64}{\sqrt{61}}.$$

So $\hat{n} = \frac{1}{\sqrt{61}}(5, 6, 0)$, $\hat{d} = \frac{64}{\sqrt{61}}$, and $a \cdot \hat{n} = \frac{70}{\sqrt{61}}$.

Altogether then, the distance between the point and the plane is $|\hat{d} - a \cdot \hat{n}| = \frac{6}{\sqrt{61}}$ and the foot of the perpendicular is

$$a + (\hat{d} - a \cdot \hat{n})\hat{n} = (8, 5, 9) - \frac{6}{\sqrt{61}} \cdot \frac{(5, 6, 0)}{\sqrt{61}} = \frac{1}{61}(458, 269, 549).$$

By the way, notice that in this example, n points in the opposite direction from $\vec{ab}$. And so $\widehat{ab} = -\hat{n}$. And moreover, $\hat{d} - a \cdot \hat{n} < 0$.

[Figure, not to scale: the plane through p = (8, 4, 5), the point a = (8, 5, 9), and the foot of the perpendicular b.]
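The Hessian-normal-form calculation of Exercise 132 can be sketched in Python as follows. The function name `foot_on_plane` is an assumption of this example, and the code works with the unnormalised normal n to keep the foot of the perpendicular exact.

```python
from fractions import Fraction
from math import sqrt

def foot_on_plane(a, n, d):
    """Foot of the perpendicular from point a to the plane r . n = d,
    together with the distance from a to the plane."""
    n_sq = sum(ni * ni for ni in n)
    # k = (d - a.n) / |n|^2, which equals (d_hat - a.n_hat) / |n|
    k = Fraction(d - sum(ai * ni for ai, ni in zip(a, n)), n_sq)
    foot = [ai + k * ni for ai, ni in zip(a, n)]
    return foot, abs(k) * sqrt(n_sq)

foot, dist = foot_on_plane((8, 0, 2), (2, 7, 2), 42)
print(foot, dist)
```

The sign of k tells whether n points from a towards the plane (k > 0) or away from it (k < 0), matching the remark at the end of each part.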

88.8 Answers for Ch. 34: Angles

Answer to Exercise 133. (a) $\frac{(-1, 1) \cdot (2, -3)}{|(-1, 1)||(2, -3)|} = \frac{-5}{\sqrt{2}\sqrt{13}} = \frac{-5}{\sqrt{26}}$. So $\theta = \cos^{-1}\frac{-5}{\sqrt{26}} \approx 2.944$. This is the obtuse angle between the two lines. So the acute angle between the two lines is $\pi - \theta \approx 0.197$.

(b) $\frac{(1, 5) \cdot (8, 1)}{|(1, 5)||(8, 1)|} = \frac{13}{\sqrt{26}\sqrt{65}} = \frac{13}{\sqrt{13 \times 2}\sqrt{13 \times 5}} = \frac{1}{\sqrt{10}}$. So $\theta = \cos^{-1}\frac{1}{\sqrt{10}} \approx 1.249$. This is the acute angle between the two lines.

(c) $\frac{(2, 6) \cdot (3, 2)}{|(2, 6)||(3, 2)|} = \frac{18}{\sqrt{40}\sqrt{13}} = \frac{9}{\sqrt{10}\sqrt{13}} = \frac{9}{\sqrt{130}}$. So $\theta = \cos^{-1}\frac{9}{\sqrt{130}} \approx 0.661$. This is the acute angle between the two lines.

Answer to Exercise 134. (a) $\frac{(-1, 1, 0) \cdot (2, -3, 4)}{|(-1, 1, 0)||(2, -3, 4)|} = \frac{-5}{\sqrt{2}\sqrt{29}} = \frac{-5}{\sqrt{58}}$. So $\theta = \cos^{-1}\frac{-5}{\sqrt{58}} \approx 2.287$. This is the obtuse angle between the two lines. So the acute angle between the two lines is $\pi - \theta \approx 0.855$.

(b) $\frac{(1, 5, 6) \cdot (8, 1, 1)}{|(1, 5, 6)||(8, 1, 1)|} = \frac{19}{\sqrt{62}\sqrt{66}}$. So $\theta = \cos^{-1}\frac{19}{\sqrt{62}\sqrt{66}} \approx 1.269$. This is the acute angle between the two lines.

(c) $\frac{(2, 6, 7) \cdot (3, 2, 1)}{|(2, 6, 7)||(3, 2, 1)|} = \frac{25}{\sqrt{89}\sqrt{14}}$. So $\theta = \cos^{-1}\frac{25}{\sqrt{89}\sqrt{14}} \approx 0.784$. This is the acute angle between the two lines.
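As a numerical check on Exercises 133 and 134, here is a small Python sketch. Taking the absolute value of the dot product returns the acute angle directly, which is a shortcut assumed for this example rather than the two-step method used in the answers; the name `acute_angle` is likewise illustrative.

```python
from math import acos, sqrt

def acute_angle(u, v):
    """Acute (or right) angle in radians between lines with directions u and v."""
    dot = sum(x * y for x, y in zip(u, v))
    norms = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return acos(abs(dot) / norms)

print(round(acute_angle((-1, 1), (2, -3)), 3))
print(round(acute_angle((2, 6, 7), (3, 2, 1)), 3))
```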

Answer to Exercise 135. (a) The angle between the line r = (−1, 2, 3) + λ(−1, 1, 0) (λ ∈ R) and the plane r ⋅ (3, 4, 5) = 1 is:

$$\sin^{-1}\left|\frac{v \cdot n}{|v||n|}\right| = \sin^{-1}\left|\frac{(-1, 1, 0) \cdot (3, 4, 5)}{|(-1, 1, 0)||(3, 4, 5)|}\right| = \sin^{-1}\left|\frac{1}{\sqrt{2}\sqrt{50}}\right| = \sin^{-1}(0.1) \approx 0.100.$$

(b) The angle between the line r = (−1, 2, 3) + λ(0, 2, 6) (λ ∈ R) and the plane r ⋅ (1, 3, 5) = 2 is:

$$\sin^{-1}\left|\frac{v \cdot n}{|v||n|}\right| = \sin^{-1}\left|\frac{(0, 2, 6) \cdot (1, 3, 5)}{|(0, 2, 6)||(1, 3, 5)|}\right| = \sin^{-1}\left|\frac{36}{\sqrt{40}\sqrt{35}}\right| \approx 1.295.$$

(c) The angle between the line r = (−1, 2, 3) + λ(1, 9, 8) (λ ∈ R) and the plane r ⋅ (2, 8, 2) = 3 is:

$$\sin^{-1}\left|\frac{v \cdot n}{|v||n|}\right| = \sin^{-1}\left|\frac{(1, 9, 8) \cdot (2, 8, 2)}{|(1, 9, 8)||(2, 8, 2)|}\right| = \sin^{-1}\left|\frac{90}{\sqrt{146}\sqrt{72}}\right| \approx 1.071.$$
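The line-plane angle formula of Exercise 135 translates into a few lines of Python. This is only an illustrative sketch; `line_plane_angle` is a hypothetical helper name.

```python
from math import asin, sqrt

def line_plane_angle(v, n):
    """Angle in radians between a line with direction v and a plane with normal n."""
    dot = sum(x * y for x, y in zip(v, n))
    norms = sqrt(sum(x * x for x in v) * sum(y * y for y in n))
    return asin(abs(dot) / norms)

print(round(line_plane_angle((0, 2, 6), (1, 3, 5)), 3))
```

Note the use of the inverse sine here, in contrast to the inverse cosine used for the angle between two lines: the normal n is perpendicular to the plane, so the angle to the plane is the complement of the angle to n.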

Answer to Exercise 136. (a) Consider the planes r ⋅ (−1, −2, −3) = 1 and r ⋅ (3, 4, 5) = 2. The angle between them is

$$\theta = \cos^{-1}\left(\frac{(-1, -2, -3) \cdot (3, 4, 5)}{|(-1, -2, -3)||(3, 4, 5)|}\right) = \cos^{-1}\left(\frac{-26}{\sqrt{14}\sqrt{50}}\right) \approx 2.955.$$

This is the obtuse angle. So the acute angle between the two planes is $\pi - \theta \approx 0.186$ radian.

(b) Consider the planes described, respectively, by the vector equations r ⋅ (1, 2, 3) = 3 and r ⋅ (5, 1, 1) = 4. The angle between them is

$$\theta = \cos^{-1}\left(\frac{(1, 2, 3) \cdot (5, 1, 1)}{|(1, 2, 3)||(5, 1, 1)|}\right) = \cos^{-1}\left(\frac{10}{\sqrt{14}\sqrt{27}}\right) \approx 1.031.$$

(c) Consider the planes described, respectively, by the vector equations r ⋅ (1, 1, −8) = 5 and r ⋅ (−3, 0, 10) = 6. The angle between them is

$$\theta = \cos^{-1}\left(\frac{(1, 1, -8) \cdot (-3, 0, 10)}{|(1, 1, -8)||(-3, 0, 10)|}\right) = \cos^{-1}\left(\frac{-83}{\sqrt{66}\sqrt{109}}\right) \approx 2.934.$$

This is the obtuse angle. So the acute angle between the two planes is $\pi - \theta \approx 0.207$ radian.

88.9 Answers for Ch. 35: Relationships between Lines and Planes

Answer to Exercise 137. (a) The lines r = (8, 1, 5) + λ(3, 2, 1) and r = (1, 2, 3) + λ(5, 6, 7) (λ ∈ R) are not parallel because their direction vectors cannot be written as scalar multiples of each other. Let’s see if they have an intersection point. If they intersect, then there are reals α and β such that (8, 1, 5) + α(3, 2, 1) = (1, 2, 3) + β(5, 6, 7), or

8 + 3α = 1 + 5β, (1)
1 + 2α = 2 + 6β, (2)
5 + α = 3 + 7β. (3)

Take 2 × (3) minus (2) to get (10 + 2α) − (1 + 2α) = (6 + 14β) − (2 + 6β) or 9 = 4 + 8β or β = 5/8. Now from (3), this means that α = 19/8. These values do not work if we try plugging them into (1): the left side is 8 + 3(19/8) = 121/8, but the right side is 1 + 5(5/8) = 33/8. Hence, the two lines do not intersect. And so they are not coplanar, or equivalently, they are skew.

(b) The lines r = (0, 0, 6) + λ(3, 9, 0) and r = (1, 1, 1) + λ(1, 3, 0) (λ ∈ R) have direction vectors that can be written as scalar multiples of each other, so they are parallel. Thus, they are also coplanar, or equivalently, they are not skew.

Remember: We need two non-parallel vectors and a point to determine a plane. Here, the direction vectors of the two lines are parallel. So we need another vector. This is not difficult. Simply consider a vector from a point on the first line to a point on the second, e.g. (1, 1, 1) − (0, 0, 6) = (1, 1, −5). Now, the plane that contains both lines has normal vector (1, 1, −5) × (1, 3, 0) = (15, −5, 2). And so the plane that contains both lines is r ⋅ (15, −5, 2) = 12.

(c) The lines r = (6, 5, 5) + λ(1, 0, 1) and r = (9, 3, 6) + λ(0, 1, 1) (λ ∈ R) are not parallel because their direction vectors cannot be written as scalar multiples of each other. Let’s see if they have an intersection point. If they intersect, then there are reals α and β such that (6, 5, 5) + α(1, 0, 1) = (9, 3, 6) + β(0, 1, 1), or

6 + α = 9, (1)
5 = 3 + β, (2)
5 + α = 6 + β. (3)

From (1) and (2), we have α = 3, β = 2. This is consistent with (3). And so α = 3, β = 2 is a solution for the above set of equations. Hence, the two lines do intersect, in particular at (6, 5, 5) + 3(1, 0, 1) = (9, 5, 8) or (9, 3, 6) + 2(0, 1, 1) = (9, 5, 8). Thus, they are also coplanar, or equivalently, they are not skew.

Remember: We need two non-parallel vectors and a point to determine a plane. Here, the direction vectors of the two lines are not parallel. So a plane that contains both lines has normal vector (1, 0, 1) × (0, 1, 1) = (−1, −1, 1). And so the plane that contains both lines is r ⋅ (−1, −1, 1) = −6.
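The three cases in Exercise 137 can be classified mechanically. The Python sketch below uses the scalar triple product as a coplanarity test, which is a different (though equivalent) technique from solving the simultaneous equations by hand; the function name `classify_lines` is an assumption of this example.

```python
def cross(u, v):
    """Cross product of two 3-D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def classify_lines(p, u, q, v):
    """Classify the lines r = p + s*u and r = q + t*v (integer coordinates)
    as 'parallel', 'intersecting', or 'skew'."""
    n = cross(u, v)
    if n == (0, 0, 0):
        return "parallel"
    pq = tuple(qi - pi for pi, qi in zip(p, q))
    # Non-parallel lines are coplanar exactly when (q - p) . (u x v) = 0.
    return "intersecting" if dot(pq, n) == 0 else "skew"

print(classify_lines((8, 1, 5), (3, 2, 1), (1, 2, 3), (5, 6, 7)))
print(classify_lines((0, 0, 6), (3, 9, 0), (1, 1, 1), (1, 3, 0)))
print(classify_lines((6, 5, 5), (1, 0, 1), (9, 3, 6), (0, 1, 1)))
```

The same `cross` helper also reproduces the normal vectors of the planes containing the coplanar pairs.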


Answer to Exercise 138. (a) Given the line r = (4, 5, 6) + λ(2, 3, 5) (λ ∈ R) and the plane r ⋅ (−10, 0, 4) = −26, we have (2, 3, 5) ⋅ (−10, 0, 4) = 0 and so they are parallel.

The point (4, 5, 6) on the line is not on the plane, as we can easily verify — (4, 5, 6) ⋅ (−10, 0, 4) = −16 ≠ −26. And so they do not intersect at all.

(b) Given the line r = (5, 5, 6) + λ(2, 3, 5) (λ ∈ R) and the plane r ⋅ (−10, 0, 4) = −26, we have (2, 3, 5) ⋅ (−10, 0, 4) = 0 and so they are parallel.

The point (5, 5, 6) on the line is on the plane because (5, 5, 6) ⋅ (−10, 0, 4) = −26. Since the line and plane are parallel and share at least one intersection point, it must be that the line lies completely on the plane.

(c) Given the line r = (4, 5, 6) + λ(2, 3, 5) (λ ∈ R) and the plane r ⋅ (−10, 0, 3) = −26, we have (2, 3, 5) ⋅ (−10, 0, 3) = −5 ≠ 0 and so they are not parallel. They must therefore intersect at exactly one point. Let’s find it.

Plug in a generic point of the line into the equation for the plane: (4 + 2λ, 5 + 3λ, 6 + 5λ) ⋅ (−10, 0, 3) = −26 ⇐⇒ −40 − 20λ + 18 + 15λ = −26 ⇐⇒ 4 = 5λ ⇐⇒ λ = 4/5. So the intersection point is (4, 5, 6) + (4/5)(2, 3, 5) = (5.6, 7.4, 10).
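The case analysis of Exercise 138 can be sketched in Python as below. The function name `intersect_line_plane` and the string return values are assumptions made for this illustration.

```python
from fractions import Fraction

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def intersect_line_plane(p, v, n, d):
    """Intersect the line r = p + t*v with the plane r . n = d.
    Returns the intersection point, or a string for the two parallel cases."""
    vn, pn = dot(v, n), dot(p, n)
    if vn == 0:  # direction is perpendicular to the normal: line parallel to plane
        return "line lies in the plane" if pn == d else "parallel, no intersection"
    t = Fraction(d - pn, vn)  # solve (p + t*v) . n = d for t
    return tuple(pi + t * vi for pi, vi in zip(p, v))

print(intersect_line_plane((4, 5, 6), (2, 3, 5), (-10, 0, 4), -26))
print(intersect_line_plane((5, 5, 6), (2, 3, 5), (-10, 0, 4), -26))
print(intersect_line_plane((4, 5, 6), (2, 3, 5), (-10, 0, 3), -26))
```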

Answer to Exercise 139. (a) The planes are r ⋅ (4, 9, 3) = 61 and r ⋅ (1, 1, 2) = 19. Clearly, (4, 9, 3) cannot be written as a scalar multiple of (1, 1, 2). So the two planes are not parallel and share an intersection line that has direction vector (4, 9, 3) × (1, 1, 2) = (15, −5, −5) or (−3, 1, 1). Find a point (x, y, z) where the two planes intersect:

4x + 9y + 3z = 61, (1)
x + y + 2z = 19. (2)

Use the “plug in x = 0” trick. Then (1) minus 9 × (2) yields −15z = −110 or z = 22/3. And so y = 13/3. Hence, the intersection line is r = (0, 13/3, 22/3) + λ(−3, 1, 1) (λ ∈ R).

(b) The planes are r ⋅ (1, 1, 0) = 4 and r ⋅ (1, 6, 8) = 60. Clearly, (1, 1, 0) cannot be written as a scalar multiple of (1, 6, 8). So the two planes are not parallel and share an intersection line that has direction vector (1, 1, 0) × (1, 6, 8) = (8, −8, 5). Find a point (x, y, z) where the two planes intersect:

x + y = 4, (1)
x + 6y + 8z = 60. (2)

Use the “plug in x = 0” trick. Then (1) implies that y = 4. And so from (2), z = 4.5. And so one intersection point of the two planes is (0, 4, 4.5). Hence, the intersection line is r = (0, 4, 4.5) + λ(8, −8, 5) (λ ∈ R).

(c) The planes are r ⋅ (4, 4, 8) = 56 and r ⋅ (1, 1, 2) = 12. Clearly, (4, 4, 8) can be written as a scalar multiple of (1, 1, 2). So the two planes are parallel. Do they not intersect at all or are they identical?

The point (1, 3, 5) is on the first plane, but is not on the second because (1, 3, 5) ⋅ (1, 1, 2) = 14 ≠ 12. And so they are parallel planes that do not intersect at all.

(d) The planes are r ⋅ (4, 4, 8) = 48 and r ⋅ (1, 1, 2) = 12. Clearly, (4, 4, 8) can be written as a scalar multiple of (1, 1, 2). So the two planes are parallel. Do they not intersect at all or are they identical?

The point (1, 3, 4) is on the first plane and is also on the second because (1, 3, 4) ⋅ (1, 1, 2) = 12. And so they are identical planes.
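The procedure of Exercise 139 (a) and (b), a cross product for the direction followed by the “plug in x = 0” trick for a point, can be sketched in Python. The name `intersection_line` is hypothetical, and Cramer's rule is used here to solve the two remaining equations, which is an assumption of this sketch rather than the elimination shown in the answers.

```python
from fractions import Fraction

def cross(u, v):
    """Cross product of two 3-D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def intersection_line(n1, d1, n2, d2):
    """Intersection of the planes r . n1 = d1 and r . n2 = d2.
    Returns (point, direction), or None if the planes are parallel."""
    direction = cross(n1, n2)
    if direction == (0, 0, 0):
        return None
    det = direction[0]  # n1[1]*n2[2] - n1[2]*n2[1], the y-z determinant
    if det == 0:
        raise ValueError("the x = 0 trick fails here; set another coordinate to 0")
    # With x = 0, solve the 2x2 system in y and z by Cramer's rule.
    y = Fraction(d1 * n2[2] - n1[2] * d2, det)
    z = Fraction(n1[1] * d2 - d1 * n2[1], det)
    return (Fraction(0), y, z), direction

print(intersection_line((4, 9, 3), 61, (1, 1, 2), 19))
print(intersection_line((4, 4, 8), 56, (1, 1, 2), 12))
```

For parallel planes, such as those in parts (c) and (d), the function returns `None`; telling identical planes apart from disjoint ones still requires testing a single point, as in the answers.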

Answer to Exercise 140. The planes P1, P2, and P3 are r ⋅ (1, 0, 1) = 1, r ⋅ (0, 1, −1) = −1, and r ⋅ (1, 1, 0) = 2.

Step #1. Check if any two planes are parallel.

By observation, no plane’s normal vector can be written as a scalar multiple of another plane’s normal vector. So no two planes are parallel. (So we are in Case 3.)

Step #2. Find the 3 intersection lines along which each pair of planes intersect.

The planes P1 and P2 share an intersection line with direction vector (1, 0, 1) × (0, 1, −1) = (−1, 1, 1). Let’s look for an intersection point (x, y, z):

x + z = 1, (1)
y − z = −1. (2)

Use the “plug in x = 0” trick to see that one intersection point is (0, 0, 1). Hence the intersection line of P1 and P2 is r = (0, 0, 1) + λ(−1, 1, 1) (λ ∈ R). Call this line l1.

The planes P1 and P3 share an intersection line with direction vector (1, 0, 1) × (1, 1, 0) = (−1, 1, 1). Let’s look for an intersection point (x, y, z):

x + z = 1, (1)
x + y = 2. (2)

Use the “plug in x = 0” trick to see that one intersection point is (0, 2, 1). Hence the intersection line of P1 and P3 is r = (0, 2, 1) + λ(−1, 1, 1) (λ ∈ R). Call this line l2.

The planes P2 and P3 share an intersection line with direction vector (0, 1, −1) × (1, 1, 0) = (1, −1, −1) or (−1, 1, 1). Let’s look for an intersection point (x, y, z):

y − z = −1, (1)
x + y = 2. (2)

Use the “plug in x = 0” trick to see that one intersection point is (0, 2, 3). Hence the intersection line of P2 and P3 is r = (0, 2, 3) + λ(−1, 1, 1) (λ ∈ R). Call this line l3.

Step #3. Determine where, if at all, the 3 intersection lines intersect.

Clearly, all 3 lines are parallel (because they all have the same direction vector). But l1 and l2 are distinct, because (0, 0, 1) is on l1 but not on l2. Thus, we must be in Case 3a, where all 3 lines are distinct and do not intersect.

Conclusion. Altogether, we conclude that the 3 intersection lines do not intersect. The 3 planes form 3 distinct intersection lines. (So we are in Case 3a.)

89 Answers to Exercises in Part IV: Complex Numbers

89.1 Answers for Ch. 36: Introduction to Complex Numbers

Answer to Exercise 141.

Is this ...   A complex   A real    An imaginary   A purely imaginary   The imaginary
              number?     number?   number?        number?              unit?
13 − 2i       Yes         No        Yes            No                   No
√3 i          Yes         No        Yes            Yes                  No
0             Yes         Yes       No             No                   No
4             Yes         Yes       No             No                   No
4 + 2i        Yes         No        Yes            No                   No
i             Yes         No        Yes            Yes                  Yes
√3            Yes         Yes       No             No                   No

Answer to Exercise 142. We rewrite each, by rationalising any denominators with surds and also writing out the sine or cosine values.

$$a = \frac{\sqrt{2}}{2} - \frac{\sqrt{2}}{2}i, \quad b = \frac{\sqrt{3}}{2} - \frac{\sqrt{2}}{2}i, \quad c = \frac{\sqrt{2}}{2} - \frac{\sqrt{3}}{2}i, \quad \text{and} \quad d = \frac{\sqrt{2}}{2} - \frac{\sqrt{3}}{2}i.$$

Comparing the real and imaginary parts, we see that only c = d.

89.2 Answers for Ch. 37: Basic Arithmetic of Complex Numbers

Answer to Exercise 143. (a) Given z = −5 + 2i and w = 7 + 3i, we have z + w = 2 + 5i and z − w = −12 − i.

(b) Given z = 3 − i and w = 11 + 2i, we have z + w = 14 + i and z − w = −8 − 3i.

(c) Given z = 1 + 2i and w = 3 − √2 i, we have z + w = 4 + (2 − √2)i and z − w = −2 + (2 + √2)i.

Answer to Exercise 144. If z = a + bi and w = c + di, then

zw = (a + bi)(c + di) = ac + adi + bci + bdi² = ac − bd + (ad + bc)i = (ac − bd, ad + bc),

as desired.

Answer to Exercise 145. (a) Given z = −5 + 2i and w = 7 + 3i, we have zw = (−5 + 2i)(7 + 3i) = −35 − 15i + 14i + 6i² = −41 − i.

(b) Given z = 3 − i and w = 11 + 2i, we have zw = (3 − i)(11 + 2i) = 33 + 6i − 11i − 2i² = 35 − 5i.

(c) Given z = 1 + 2i and w = 3 − √2 i, we have zw = (1 + 2i)(3 − √2 i) = 3 − √2 i + 6i − 2√2 i² = 3 + 2√2 + (6 − √2)i.

Answer to Exercise 146. (2 + i)² = 4 − 1 + 4i = 3 + 4i.

(2 + i)³ = (2 + i)(2 + i)² = (2 + i)(3 + 4i) = 6 + 8i + 3i − 4 = 2 + 11i.

Hence, az³ + bz² + 3z − 1 = a(2 + i)³ + b(2 + i)² + 3z − 1 = a(2 + 11i) + b(3 + 4i) + 3(2 + i) − 1 = 2a + 3b + 5 + (11a + 4b + 3)i. Two complex numbers are equal if and only if their real and imaginary parts are equal. So

2a + 3b + 5 = 0, (1)
11a + 4b + 3 = 0. (2)

Take 3 × (2) minus 4 × (1) to get 33a + 9 − (8a + 20) = 25a − 11 = 0. So a = 11/25 and b = −49/25.

Answer to Exercise 147. (a) z* = −5 − 2i. So 1/z = z*/(5² + 2²) = (−5 − 2i)/29.

(b) z* = 3 + i. So 1/z = z*/(3² + 1²) = (3 + i)/10.

(c) z* = 1 − 2i. So 1/z = z*/(1² + 2²) = (1 − 2i)/5.

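Answer to Exercise 148.

(a) $\frac{1 + 3i}{-i} = \frac{1 + 3i}{-i} \times \frac{i}{i} = \frac{i + 3i^2}{-i^2} = \frac{i - 3}{1} = -3 + i.$

(b) $\frac{2 - 3i}{1 + i} = \frac{2 - 3i}{1 + i} \times \frac{1 - i}{1 - i} = \frac{2 - 3i - 2i + 3i^2}{1 - i^2} = \frac{-1 - 5i}{2} = -0.5 - 2.5i.$

(c) $\frac{\sqrt{2} - \pi i}{3 - \sqrt{2}i} = \frac{\sqrt{2} - \pi i}{3 - \sqrt{2}i} \times \frac{3 + \sqrt{2}i}{3 + \sqrt{2}i} = \frac{3\sqrt{2} + 2i - 3\pi i - \pi\sqrt{2}i^2}{9 - 2i^2} = \frac{\sqrt{2}(3 + \pi) + (2 - 3\pi)i}{11}.$

(d) $\frac{11 + 2i}{i} = \frac{11 + 2i}{i} \times \frac{-i}{-i} = \frac{-11i - 2i^2}{-i^2} = \frac{-11i + 2}{1} = 2 - 11i.$

(e) $\frac{-3}{2 + i} = \frac{-3}{2 + i} \times \frac{2 - i}{2 - i} = \frac{-6 + 3i}{4 - i^2} = \frac{-6 + 3i}{5} = -1.2 + 0.6i.$

(f) $\frac{7 - 2i}{5 + i} = \frac{7 - 2i}{5 + i} \times \frac{5 - i}{5 - i} = \frac{35 - 7i - 10i + 2i^2}{25 - i^2} = \frac{33 - 17i}{26}.$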
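The "multiply by the conjugate" technique of Exercise 148 can be written as a short Python sketch for complex numbers with integer parts. The function name `divide` and the (real, imaginary) tuple representation are assumptions of this example.

```python
from fractions import Fraction

def divide(a, b, c, d):
    """Compute (a + bi) / (c + di) by multiplying top and bottom by the
    conjugate c - di. Returns the exact (real part, imaginary part)."""
    denom = c * c + d * d  # (c + di)(c - di) = c^2 + d^2
    # (a + bi)(c - di) = (ac + bd) + (bc - ad)i
    return Fraction(a * c + b * d, denom), Fraction(b * c - a * d, denom)

print(divide(2, -3, 1, 1))   # part (b)
print(divide(7, -2, 5, 1))   # part (f)
```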

89.3 Answers for Ch. 38: Solving Polynomial Equations

Answer to Exercise 149. (a) The roots to the equation x² + x + 1 = 0 are given by

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{-1 \pm \sqrt{-3}}{2} = -\frac{1}{2} \pm \frac{\sqrt{3} \times \sqrt{-1}}{2} = -\frac{1}{2} \pm \frac{\sqrt{3}}{2}i.$$

(b) The roots to the equation x² + 2x + 2 = 0 are given by

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{-2 \pm \sqrt{-4}}{2} = -1 \pm \frac{\sqrt{4} \times \sqrt{-1}}{2} = -1 \pm i.$$

(c) The roots to the equation 3x² + 3x + 1 = 0 are given by

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{-3 \pm \sqrt{-3}}{6} = -\frac{1}{2} \pm \frac{\sqrt{3} \times \sqrt{-1}}{6} = -\frac{1}{2} \pm \frac{\sqrt{3}}{6}i.$$

Answer to Exercise 150. If 1 − i is a root to the quadratic equation x² + bx + c = 0, then so too is 1 + i. And so x² + bx + c = [x − (1 − i)] [x − (1 + i)] = (x − 1)² − (i)² = x² − 2x + 2.

Hence, b = −2 and c = 2.
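The quadratic-formula answers in Exercises 149 and 150 can be checked numerically with Python's standard `cmath` module. The helper name `quadratic_roots` is an assumption of this sketch.

```python
import cmath

def quadratic_roots(a, b, c):
    """Both roots of a*x^2 + b*x + c = 0, allowing a negative discriminant."""
    root = cmath.sqrt(b * b - 4 * a * c)  # complex square root of the discriminant
    return (-b + root) / (2 * a), (-b - root) / (2 * a)

r1, r2 = quadratic_roots(1, 2, 2)
print(r1, r2)
print(quadratic_roots(1, -2, 2))  # the quadratic found in Exercise 150
```

Because the coefficients are real, the two roots always come out as a conjugate pair whenever the discriminant is negative.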

Answer to Exercise 151. (a) Do the long division:

                x² + 2x + 2
    x − 1 ) x³ +  x² + 0x − 2
            x³ −  x²
                 2x² + 0x
                 2x² − 2x
                       2x − 2
                       2x − 2
                            0

So x³ + x² − 2 = (x − 1)(x² + 2x + 2). Use the quadratic formula to further factorise x² + 2x + 2:

$$x = \frac{-2 \pm \sqrt{2^2 - 4(1)(2)}}{2} = -1 \pm \sqrt{1 - 2} = -1 \pm i.$$

Hence, x³ + x² − 2 = (x − 1)(x² + 2x + 2) = (x − 1) [x − (−1 + i)] [x − (−1 − i)]. The three zeros of the polynomial are 1 and −1 ± i.

(b) Do the long division:

                x³ +  x² + 0x − 2
    x − 1 ) x⁴ + 0x³ −  x² − 2x + 2
            x⁴ −  x³
                  x³ −  x²
                  x³ −  x²
                         0 − 2x + 2
                           − 2x + 2
                                  0

So x⁴ − x² − 2x + 2 = (x − 1)(x³ + x² − 2). From part (a), we know that x³ + x² − 2 = (x − 1) [x − (−1 + i)] [x − (−1 − i)].

So x⁴ − x² − 2x + 2 = (x − 1)² [x − (−1 + i)] [x − (−1 − i)]. So the four zeros of the polynomial are 1 (repeated) and −1 ± i.

Answer to Exercise 152. (a) Since 2 − 3i is a zero and the polynomial has real coefficients, by the complex conjugate roots theorem, 2 + 3i is also a zero. Hence, a factor for the polynomial is [x − (2 − 3i)] [x − (2 + 3i)] = (x − 2)² − (3i)² = x² − 4x + 4 + 9 = x² − 4x + 13. Now do the long division:

                       x² −  2x −  3
    x² − 4x + 13 ) x⁴ − 6x³ + 18x² − 14x − 39
                   x⁴ − 4x³ + 13x²
                      − 2x³ +  5x² − 14x
                      − 2x³ +  8x² − 26x
                            −  3x² + 12x − 39
                            −  3x² + 12x − 39
                                            0

So the polynomial also has quadratic factor x² − 2x − 3, which we observe can in turn be factorised as (x − 3)(x + 1). So the four zeros of the polynomial are 2 ± 3i, 3, and −1. And if we want to write down the four factors of the polynomial, we can easily do so: x⁴ − 6x³ + 18x² − 14x − 39 = [x − (2 − 3i)] [x − (2 + 3i)] (x − 3)(x + 1).

(b) Again, we know that a quadratic factor for the polynomial is x² − 4x + 13, so go ahead and do the long division:

                     −2x² + 13x − 15
    x² − 4x + 13 ) −2x⁴ + 21x³ − 93x² + 229x − 195
                   −2x⁴ +  8x³ − 26x²
                          13x³ − 67x² + 229x
                          13x³ − 52x² + 169x
                               − 15x² +  60x − 195
                               − 15x² +  60x − 195
                                                 0

So the other quadratic factor is −2x² + 13x − 15. If it’s not obvious how this can be factorised, then go ahead and use the quadratic formula:

$$x = \frac{-13 \pm \sqrt{13^2 - 4(-2)(-15)}}{2(-2)} = \frac{13 \mp \sqrt{169 - 120}}{4} = \frac{13 \mp 7}{4} = 5, 1.5.$$

So the four zeros of the polynomial are 2 ± 3i, 5, and 1.5.

By the way, if you need to write down the four factors of the polynomial, take care to note that −2x² + 13x − 15 ≠ (x − 5)(x − 1.5) = x² − 6.5x + 7.5. Instead, −2x² + 13x − 15 = −2(x − 5)(x − 1.5).
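One way to double-check a long division is to multiply the divisor by the quotient and compare with the original polynomial. The sketch below does this in plain Python; the coefficient-list representation (highest degree first) and the name `poly_mul` are assumptions of this example.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, highest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# (x^2 - 4x + 13)(x^2 - 2x - 3) and (x^2 - 4x + 13)(-2x^2 + 13x - 15)
print(poly_mul([1, -4, 13], [1, -2, -3]))
print(poly_mul([1, -4, 13], [-2, 13, -15]))
```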

89.4 Answers for Ch. 39: The Argand Diagram

Answer to Exercise 153.

[Figure: Argand diagram with x- and y-axes running from −5 to 5, showing the points 1 = (1, 0), −3 = (−3, 0), 2i = (0, 2), 1 + 2i = (1, 2), and −1 − 3i = (−1, −3).]

Answer to Exercise 154. $|4| = \sqrt{4^2 + 0^2} = 4$, arg 4 = 0.

$|-3| = \sqrt{(-3)^2 + 0^2} = 3$, arg(−3) = π.

$|2i| = \sqrt{0^2 + 2^2} = 2$, arg(2i) = π/2.

$|1 + 2i| = \sqrt{1^2 + 2^2} = \sqrt{5}$, $\arg(1 + 2i) = \tan^{-1}\frac{2}{1} \approx 1.107$ rad.

$|-1 - 3i| = \sqrt{(-1)^2 + (-3)^2} = \sqrt{10}$, $\arg(-1 - 3i) = \tan^{-1}\frac{-3}{-1} - \pi \approx -1.893$ rad.

[Figure: Argand diagram with x- and y-axes running from −5 to 5, showing the points 4 = (4, 0), −3 = (−3, 0), 2i = (0, 2), 1 + 2i = (1, 2) with argument 1.107 rad, and −1 − 3i = (−1, −3) with argument −1.893 rad.]

Answer to Exercise 155. A complex number with ...

(a) positive argument is either in the top-right or in the top-left quadrant.
(b) negative argument is either in the bottom-left or in the bottom-right quadrant.
(c) argument 0 is on the positive x-axis.
(d) argument π/2 is on the positive y-axis.
(e) argument −π/2 is on the negative y-axis.
(f) argument > π/2 is in the top-left quadrant.
(g) argument < −π/2 is in the bottom-left quadrant.

Answer to Exercise 156. We already calculated the modulus and arguments of these complex numbers in Exercise 154. So

1 = cos 0 + i sin 0. −3 = 3(cos π + i sin π). 2i = 2[cos(π/2) + i sin(π/2)]. 1 + 2i ≈ √5 (cos 1.107 + i sin 1.107). −1 − 3i ≈ √10 [cos(−1.893) + i sin(−1.893)].

Answer to Exercise 157. We already calculated the modulus and arguments of these complex numbers in Exercise 154. So

$1 = e^{i0}$. $-3 = 3e^{i\pi}$. $2i = 2e^{i\pi/2}$. $1 + 2i \approx \sqrt{5}e^{i(1.107)}$. $-1 - 3i \approx \sqrt{10}e^{i(-1.893)}$.
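The moduli and principal arguments used in Exercises 154, 156 and 157 can be checked with `cmath.polar` from Python's standard library, which returns the pair (modulus, argument) with the argument in (−π, π]. The wrapper name `modulus_argument` and the rounding to 3 decimal places are assumptions of this sketch.

```python
import cmath

def modulus_argument(z):
    """Modulus and principal argument of z, each rounded to 3 decimal places."""
    r, theta = cmath.polar(z)
    return round(r, 3), round(theta, 3)

for z in (4, -3, 2j, 1 + 2j, -1 - 3j):
    print(z, modulus_argument(z))
```

Unlike a bare inverse tangent, `cmath.polar` places the argument in the correct quadrant automatically, so no manual "− π" adjustment is needed for −1 − 3i.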

89.5 Answers for Ch. 40: More Arithmetic of Complex Numbers

Answer to Exercise 158. (a) z = 1 has modulus 1 and argument 0. w = −3 has modulus 3 and argument π. Hence,

|zw| = 1 × 3 = 3,
arg(zw) = 0 + π + 2kπ = π + 2kπ = π, k = 0.

So, $zw = 3(\cos\pi + i\sin\pi) = 3e^{i\pi}$.

(You can easily verify that $3e^{i\pi} = -3$, which is indeed equal to 1 × (−3).)

(b) z = 2i has modulus 2 and argument π/2. w = 1 + 2i has modulus √5 and argument tan⁻¹(2/1). Hence,

|zw| = 2 × √5 = 2√5,
arg(zw) = π/2 + tan⁻¹ 2 + 2kπ ≈ 2.678 + 2kπ = 2.678, k = 0.

So, $zw = 2\sqrt{5}(\cos 2.678 + i\sin 2.678) = 2\sqrt{5}e^{i(2.678)}$.

(c) z = −1 − 3i has modulus √10 and argument tan⁻¹(−3/−1) − π. w = 3 + 4i has modulus 5 and argument tan⁻¹(4/3). Hence,

|zw| = √10 × 5 = 5√10,
arg(zw) = tan⁻¹ 3 − π + tan⁻¹(4/3) + 2kπ ≈ −0.965 + 2kπ = −0.965, k = 0.

So, $zw = 5\sqrt{10}[\cos(-0.965) + i\sin(-0.965)] = 5\sqrt{10}e^{i(-0.965)}$.

Answer to Exercise 159. (a) z = 1 has modulus 1 and argument 0. w = −3 has modulus 3 and argument π. Hence,

|z/w| = 1/3,
arg(z/w) = 0 − π + 2kπ = −π + 2kπ = π, k = 1.

So, $\frac{z}{w} = \frac{\cos\pi + i\sin\pi}{3} = \frac{e^{i\pi}}{3}$.

(You can easily verify that $e^{i\pi}/3 = -1/3$, which is indeed equal to 1/(−3).)

(b) z = 2i has modulus 2 and argument π/2. w = 1 + 2i has modulus √5 and argument tan⁻¹(2/1). Hence,

|z/w| = 2/√5 = 2√5/5 = 0.4√5,
arg(z/w) = π/2 − tan⁻¹ 2 + 2kπ ≈ 0.464 + 2kπ = 0.464, k = 0.

So, $\frac{z}{w} = 0.4\sqrt{5}(\cos 0.464 + i\sin 0.464) = 0.4\sqrt{5}e^{i(0.464)}$.

(c) z = −1 − 3i has modulus √10 and argument tan⁻¹(−3/−1) − π. w = 3 + 4i has modulus 5 and argument tan⁻¹(4/3). Hence,

|z/w| = √10/5 = 0.2√10,
arg(z/w) = tan⁻¹ 3 − π − tan⁻¹(4/3) + 2kπ ≈ −2.820 + 2kπ = −2.820, k = 0.

So, $\frac{z}{w} = 0.2\sqrt{10}[\cos(-2.820) + i\sin(-2.820)] = 0.2\sqrt{10}e^{i(-2.820)}$.

Answer to Exercise 160.

$$z + w = 3e^{i(0.2\pi)} + 3e^{i(-0.9\pi)} = 3e^{i(-0.35\pi)}\left(e^{i(0.55\pi)} + e^{i(-0.55\pi)}\right) = 3e^{i(-0.35\pi)}\left(2\cos(0.55\pi)\right) = 6\cos(0.55\pi)e^{i(-0.35\pi)}.$$

Note that cos(0.55π) < 0, so that arg[cos(0.55π)] = π. Hence,

$$\arg(z + w) = \arg 3 + \arg e^{i(-0.35\pi)} + \arg 2 + \arg[\cos(0.55\pi)] + 2k\pi = 0 - 0.35\pi + 0 + \pi + 2k\pi = 0.65\pi \quad (k = 0).$$

And

$$|z + w| = |3|\,|e^{-0.35i\pi}|\,|2|\,|\cos(0.55\pi)| = 3 \times 1 \times 2 \times (-\cos(0.55\pi)) = -6\cos(0.55\pi) \approx 0.939.$$

Thus, in polar and exponential form,

$$z + w = -6\cos(0.55\pi)[\cos(0.65\pi) + i\sin(0.65\pi)] = -6\cos(0.55\pi)e^{i(0.65\pi)},$$

which is the same number as $6\cos(0.55\pi)e^{i(-0.35\pi)}$.

Similarly,

$$z - w = 3e^{i(0.2\pi)} - 3e^{i(-0.9\pi)} = 3e^{i(-0.35\pi)}\left(e^{i(0.55\pi)} - e^{i(-0.55\pi)}\right) = 3e^{i(-0.35\pi)}\left(2i\sin(0.55\pi)\right).$$

Since sin(0.55π) > 0, we have arg[sin(0.55π)] = 0 and so

$$\arg(z - w) = \arg 3 + \arg e^{i(-0.35\pi)} + \arg 2 + \arg i + \arg[\sin(0.55\pi)] + 2k\pi = 0 - 0.35\pi + 0 + \pi/2 + 0 + 2k\pi = 0.15\pi \quad (k = 0).$$

And

$$|z - w| = |3|\,|e^{-0.35i\pi}|\,|2i|\,|\sin(0.55\pi)| = 3 \times 1 \times 2 \times \sin(0.55\pi) = 6\sin(0.55\pi).$$

Thus, $z - w = 6\sin(0.55\pi)[\cos(0.15\pi) + i\sin(0.15\pi)] = 6\sin(0.55\pi)e^{i(0.15\pi)}$.

We can also verify that a blind application of the formulae given in Fact 54 gives the same complex numbers:

$$3e^{i(0.2\pi)} + 3e^{i(-0.9\pi)} = 3 \times 2\cos\frac{0.2\pi - (-0.9)\pi}{2}\,e^{i(0.2\pi - 0.9\pi)/2} = 6\cos(0.55\pi)e^{i(-0.35\pi)},$$

$$3e^{i(0.2\pi)} - 3e^{i(-0.9\pi)} = 3 \times 2\sin\frac{0.2\pi - (-0.9)\pi}{2}\,e^{i(0.2\pi - 0.9\pi + \pi)/2} = 6\sin(0.55\pi)e^{i(0.15\pi)}.$$
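A quick numerical check of Exercise 160 with Python's `cmath` module is sketched below; the variable names are illustrative. Note that `cmath.polar` always reports a non-negative modulus, so for z + w it returns the modulus |6 cos(0.55π)| together with the argument 0.65π, because cos(0.55π) is negative.

```python
import cmath
from math import cos, sin, pi

z = 3 * cmath.exp(0.2j * pi)
w = 3 * cmath.exp(-0.9j * pi)
total, diff = z + w, z - w

# Closed forms obtained by factoring out e^{i(-0.35*pi)}.
total_formula = 6 * cos(0.55 * pi) * cmath.exp(-0.35j * pi)
diff_formula = 6 * sin(0.55 * pi) * cmath.exp(0.15j * pi)

print(cmath.isclose(total, total_formula), cmath.isclose(diff, diff_formula))
print(cmath.polar(total))  # (modulus, principal argument) of z + w
print(cmath.polar(diff))   # (modulus, principal argument) of z - w
```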

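Answer to Exercise 161.

$$e^{i\theta} + e^{i\varphi} = e^{i\frac{\theta+\varphi}{2}}\left(e^{i\frac{\theta-\varphi}{2}} + e^{i\left(-\frac{\theta-\varphi}{2}\right)}\right) = e^{i\frac{\theta+\varphi}{2}}\left(2\cos\frac{\theta-\varphi}{2}\right).$$

Taking $\cos\frac{\theta-\varphi}{2} > 0$ (so that its argument is 0), this implies

$$\arg\left(e^{i\theta} + e^{i\varphi}\right) = \arg\left[e^{i\frac{\theta+\varphi}{2}}\left(2\cos\frac{\theta-\varphi}{2}\right)\right] = \arg\left(e^{i\frac{\theta+\varphi}{2}}\right) + \arg 2 + \arg\left(\cos\frac{\theta-\varphi}{2}\right) + 2k\pi = \frac{\theta+\varphi}{2} + 2k\pi,$$

where k is the unique integer such that 0.5(θ + φ) + 2kπ ∈ (−π, π]. And

$$\left|e^{i\theta} + e^{i\varphi}\right| = \left|e^{i\frac{\theta+\varphi}{2}}\left(2\cos\frac{\theta-\varphi}{2}\right)\right| = \left|e^{i\frac{\theta+\varphi}{2}}\right| |2| \left|\cos\frac{\theta-\varphi}{2}\right| = 1 \times 2 \times \cos\frac{\theta-\varphi}{2} = 2\cos\frac{\theta-\varphi}{2}.$$

That is, $e^{i\theta} + e^{i\varphi}$ has modulus $2\cos\frac{\theta-\varphi}{2}$ and argument $\frac{\theta+\varphi}{2}$. In other words, $e^{i\theta} + e^{i\varphi} = 2\cos\frac{\theta-\varphi}{2}\,e^{i\left(\frac{\theta+\varphi}{2}\right)}$, as desired. Similarly,

$$e^{i\theta} - e^{i\varphi} = e^{i\frac{\theta+\varphi}{2}}\left(e^{i\frac{\theta-\varphi}{2}} - e^{i\left(-\frac{\theta-\varphi}{2}\right)}\right) = e^{i\frac{\theta+\varphi}{2}}\left(2i\sin\frac{\theta-\varphi}{2}\right).$$

Taking $\sin\frac{\theta-\varphi}{2} > 0$ (so that its argument is 0), this implies

$$\arg\left(e^{i\theta} - e^{i\varphi}\right) = \arg\left[e^{i\frac{\theta+\varphi}{2}}\left(2i\sin\frac{\theta-\varphi}{2}\right)\right] = \arg\left(e^{i\frac{\theta+\varphi}{2}}\right) + \arg(2i) + \arg\left(\sin\frac{\theta-\varphi}{2}\right) + 2m\pi = \frac{\theta+\varphi}{2} + \frac{\pi}{2} + 0 + 2m\pi = \frac{\theta+\varphi+\pi}{2} + 2m\pi,$$

where m is the unique integer such that 0.5(θ + φ + π) + 2mπ ∈ (−π, π]. And

$$\left|e^{i\theta} - e^{i\varphi}\right| = \left|e^{i\frac{\theta+\varphi}{2}}\left(2i\sin\frac{\theta-\varphi}{2}\right)\right| = \left|e^{i\frac{\theta+\varphi}{2}}\right| |2i| \left|\sin\frac{\theta-\varphi}{2}\right| = 1 \times 2 \times \sin\frac{\theta-\varphi}{2} = 2\sin\frac{\theta-\varphi}{2}.$$

That is, $e^{i\theta} - e^{i\varphi}$ has modulus $2\sin\frac{\theta-\varphi}{2}$ and argument $\frac{\theta+\varphi+\pi}{2}$. In other words, $e^{i\theta} - e^{i\varphi} = 2\sin\frac{\theta-\varphi}{2}\,e^{i\left(\frac{\theta+\varphi+\pi}{2}\right)}$, as desired.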

89.6 Answers for Ch. 41: Geometry of Complex Numbers

This chapter had no exercises.

89.7 Answers for Ch. 42: Loci Involving Cartesian Equations

Answer to Exercise 162 (a) - (e).

[Figures: five panels, each showing the circle of radius r centred on (a, b) in the x-y plane, with the following locus marked.]

(a) {(x, y): (x − a)² + (y − b)² = r²}
(b) {(x, y): (x − a)² + (y − b)² ≤ r²}
(c) {(x, y): (x − a)² + (y − b)² < r²}
(d) {(x, y): (x − a)² + (y − b)² ≥ r²}
(e) {(x, y): (x − a)² + (y − b)² > r²}

Answer to Exercise 163. (a) The line equidistant to the points (1, 4) and (−5, 0) has equation y = −1.5x − 1, as we now show through two methods:

Method #1. |(x − 1, y − 4)| = |(x − (−5), y − 0)| ⇐⇒ (x − 1)² + (y − 4)² = (x + 5)² + y² ⇐⇒ x² − 2x + 1 + y² − 8y + 16 = x² + 10x + 25 + y² ⇐⇒ −2x + 1 − 8y + 16 = 10x + 25 ⇐⇒ −12x − 8y − 8 = 0 ⇐⇒ y = −1.5x − 1.

Method #2. (This second method assumes we already know that the given locus is a line.)

The desired line is perpendicular to the line connecting the points (1, 4) and (−5, 0). The latter line has slope (0 − 4)/(−5 − 1) = 2/3 and midpoint (−2, 2).

The desired line thus has equation y = −1.5x + c. The desired line passes through the midpoint (−2, 2). Hence, 2 = −1.5(−2) + c Ô⇒ c = 2 − 3 = −1. Altogether then, the desired line has equation y = −1.5x − 1.

(b) {(x, y) ∶ |(x − 17, y − 3)| = |(x + 2, y + 11)|} is the line that is equidistant from the points (17, 3) and (−2, −11).

Method #1. |(x − 17, y − 3)| = |(x + 2, y + 11)| ⇐⇒ (x − 17)² + (y − 3)² = (x + 2)² + (y + 11)² ⇐⇒ x² − 34x + 289 + y² − 6y + 9 = x² + 4x + 4 + y² + 22y + 121 ⇐⇒ −34x + 289 − 6y + 9 = 4x + 4 + 22y + 121 ⇐⇒ 0 = 38x + 28y − 173.

The equation |(x − 17, y − 3)| = |(x + 2, y + 11)| can be rewritten as 38x + 28y − 173 = 0 or y = −(19/14)x + 173/28.

Method #2. (This second method assumes we already know that the given locus is a line.)

The desired line is perpendicular to the line connecting the points (17, 3) and (−2, −11). The latter line has slope (−11 − 3)/(−2 − 17) = −14/−19 = 14/19 and midpoint (7.5, −4).

The desired line thus has equation y = −(19/14)x + c. The desired line passes through the midpoint (7.5, −4). Hence, −4 = −(19/14)(7.5) + c Ô⇒ c = −4 + (19/14)(7.5) = −4 + 285/28 = 173/28. Altogether then, the desired line has equation y = −(19/14)x + 173/28.
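Method #2 of Exercise 163 (negative-reciprocal slope through the midpoint) can be sketched in Python with exact fractions. The function name `perpendicular_bisector` is an assumption of this example, and the sketch does not handle the case of a vertical bisector.

```python
from fractions import Fraction

def perpendicular_bisector(p, q):
    """Return (m, c) such that y = m*x + c is the line equidistant from p and q."""
    (x1, y1), (x2, y2) = p, q
    if y1 == y2:
        raise ValueError("bisector is vertical; it has no slope-intercept form")
    m = Fraction(-(x2 - x1), y2 - y1)      # negative reciprocal of the slope of pq
    mx, my = Fraction(x1 + x2, 2), Fraction(y1 + y2, 2)  # midpoint of pq
    return m, my - m * mx                  # c from my = m*mx + c

print(perpendicular_bisector((1, 4), (-5, 0)))
print(perpendicular_bisector((17, 3), (-2, -11)))
```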

Answer to Exercise 164. The locus {(x, y) ∶ x² + y² = 1, x > 0} describes the right half (of the circumference) of the unit circle centred on the origin. It is illustrated in green in the figure below. Note that it excludes the endpoints (indicated by black circles).

[Figure: the unit circle {(x, y): x² + y² = 1}, the line {(x, y): x = 0}, and the locus {(x, y): x² + y² = 1, x > 0}.]

89.8 Answers for Ch. 43: Loci Involving Complex Equations

Answer to Exercise 165. (a) Let z = (x, y). Then $|z| = \sqrt{x^2 + y^2}$ and so the equation |z| = r is equivalent to $\sqrt{x^2 + y^2} = r$ or x² + y² = r².

So the locus {z ∈ C ∶ |z| = r} describes the circumference of the circle of radius r centred on the origin.

(b) Let z = (x, y) and c = (a, b). Then $|z - c| = |(x - a, y - b)| = \sqrt{(x - a)^2 + (y - b)^2}$ and so the equation |z − c| = r is equivalent to $\sqrt{(x - a)^2 + (y - b)^2} = r$ or (x − a)² + (y − b)² = r².

So the locus {z ∈ C ∶ |z − c| = r} describes the circumference of the circle of radius r centred on c.

(c) The locus {z ∈ C ∶ |z − c| ≤ r} describes the entire interior of the circle centred on c, with radius r, including the circumference of the circle.

(d) The locus {z ∈ C ∶ |z − c| < r} describes the entire interior of the circle centred on c, with radius r, excluding the circumference of the circle.

Answer to Exercise 166. (a) |z − c| ≤ |z − b| is the closed half-plane of points that are at least as close to point c as to point b.

(b) |z − c| < |z − b| is the open half-plane of points that are strictly closer to point c than to point b.

(c) |z − c| ≥ |z − b| is the closed half-plane of points that are at least as close to point b as to point c.

(d) |z − c| > |z − b| is the open half-plane of points that are strictly closer to point b than to point c.

Answer to Exercise 167. The locus {z ∈ C ∶ |z| = 1, −π < arg z < 0} describes the lower half of the circumference of the unit circle centred on the origin, excluding the endpoints on the horizontal axis.

Page 1048, Table of Contents

www.EconsPhDTutor.com

Answer to Exercise 168 (a) - (b). $|z - 2 - 2i| = 1$ describes a unit circle centred on the point $C = (2, 2)$. A quick sketch is helpful.

[Figure: the circle $|z - 2 - 2i| = 1$ with centre $C$, the origin $O$, and the labelled points $F$, $N$, $U$, $D$, and $A$. $\theta$ is the angle the line $y = x$ makes with the positive $x$-axis.]

By the properties of the circle, $|z|$ is maximised at $F$ and minimised at $N$, where $F$ and $N$ lie on the line through the origin and the circle's centre.

(a) The maximum value of $|z|$ is the length $OF = OC + CF = \sqrt{2^2 + 2^2} + 1 = \sqrt{8} + 1$.

The minimum value of $|z|$ is the length $ON = OC - CN = \sqrt{2^2 + 2^2} - 1 = \sqrt{8} - 1$.

(b) Consider $\triangle CAN$. The line through $F$, $C$, $N$, and the origin is $y = x$. So $AN = CA$. Moreover, $CA^2 + AN^2 = CN^2 = 1^2 = 1$. Altogether then, $CA^2 + CA^2 = 1$ or $CA^2 = \frac{1}{2}$ or $CA = \frac{1}{\sqrt{2}}$. And $AN = \frac{1}{\sqrt{2}}$. Hence,

$$N = \left(2 - \frac{1}{\sqrt{2}},\ 2 - \frac{1}{\sqrt{2}}\right).$$

Symmetrically, $F = \left(2 + \frac{1}{\sqrt{2}},\ 2 + \frac{1}{\sqrt{2}}\right)$.

Answer to Exercise 168 (c) - (d).

[Figure: the same sketch of the circle $|z - 2 - 2i| = 1$ with the points $O$, $C$, $F$, $N$, $U$, $D$, $A$ and the angle $\theta$, reproduced for convenience.]

(c) The points $U$ and $D$ at which $\arg z$ is maximised and minimised are also where the tangents $OU$ and $OD$ from the origin touch the circle. By the properties of the circle, $OU$ is perpendicular to $CU$. Similarly, $OD$ is perpendicular to $CD$.

The angle the upper half of the line $y = x$ makes with the positive $x$-axis is $\theta = \frac{\pi}{4}$. The angle $\angle COU$ is $\sin^{-1}\frac{CU}{OC} = \sin^{-1}\frac{1}{\sqrt{8}}$. Hence, $\arg U = \theta + \angle COU = \frac{\pi}{4} + \sin^{-1}\frac{1}{\sqrt{8}}$.

Symmetrically, $\arg D = \theta - \angle COD = \theta - \angle COU = \frac{\pi}{4} - \sin^{-1}\frac{1}{\sqrt{8}}$.

(d) $\triangle ODC$ is right. So $OD^2 + CD^2 = OC^2$. $OC = \sqrt{8}$ and $CD = 1$. Hence, $OD = \sqrt{8 - 1} = \sqrt{7}$. Altogether then, $|D| = \sqrt{7}$ and $\arg D = \frac{\pi}{4} - \sin^{-1}\frac{1}{\sqrt{8}}$.

Symmetrically, $|U| = \sqrt{7}$ and $\arg U = \frac{\pi}{4} + \sin^{-1}\frac{1}{\sqrt{8}}$.
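The four extreme values in Exercise 168 can be sanity-checked numerically. The sketch below is only an illustration, not part of the original answer: it samples points on the circle $|z - 2 - 2i| = 1$ and compares the largest and smallest modulus and argument with the closed forms above. The function name `circle_points` and the sample size are assumptions made for this example.

```python
import cmath
import math

def circle_points(centre, radius, n):
    """Return n equally spaced points on the circle |z - centre| = radius."""
    return [centre + radius * cmath.exp(2j * math.pi * k / n) for k in range(n)]

# Sample the circle |z - 2 - 2i| = 1 (sample size chosen arbitrarily).
points = circle_points(2 + 2j, 1.0, 100_000)
moduli = [abs(z) for z in points]
arguments = [cmath.phase(z) for z in points]

max_mod, min_mod = max(moduli), min(moduli)
max_arg, min_arg = max(arguments), min(arguments)

# Closed forms from the answer.
print(max_mod, math.sqrt(8) + 1)
print(min_mod, math.sqrt(8) - 1)
print(max_arg, math.pi / 4 + math.asin(1 / math.sqrt(8)))
print(min_arg, math.pi / 4 - math.asin(1 / math.sqrt(8)))
```

Each printed pair should agree to several decimal places; sampling only approximates the tangent points $U$ and $D$, so tiny differences there are expected.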

89.9 Answers for Ch. 44: De Moivre's Theorem

Answer to Exercise 169. Step #1. Let $P(k)$ stand for the proposition that

$$(\cos\theta + i\sin\theta)^k = \cos(k\theta) + i\sin(k\theta).$$

Our goal is to show that $P(k)$ is true for all $k = 1, 2, 3, \dots$

Step #2. Verify that $P(1)$ is true.

$$(\cos\theta + i\sin\theta)^1 = \cos(1 \times \theta) + i\sin(1 \times \theta).$$ ✓

Step #3. Show that $P(j)$ implies $P(j + 1)$ (for all $j = 1, 2, 3, \dots$).

Assume that $P(j)$ is true. That is,

$$(\cos\theta + i\sin\theta)^j = \cos(j\theta) + i\sin(j\theta).$$

Our goal is to show that $P(j + 1)$ is true. That is,

$$(\cos\theta + i\sin\theta)^{j+1} = \cos[(j + 1)\theta] + i\sin[(j + 1)\theta].$$

To this end, write

$$\begin{aligned}
(\cos\theta + i\sin\theta)^{j+1} &= (\cos\theta + i\sin\theta)^j(\cos\theta + i\sin\theta) \\
&= [\cos(j\theta) + i\sin(j\theta)](\cos\theta + i\sin\theta) \\
&= \cos(j\theta)\cos\theta + i\cos(j\theta)\sin\theta + i\sin(j\theta)\cos\theta - \sin(j\theta)\sin\theta \\
&= \cos[(j + 1)\theta] + i\sin[(j + 1)\theta],
\end{aligned}$$

as desired. (The last line uses the addition formulae for $\cos(A + B)$ and $\sin(A + B)$.)

Answer to Exercise 170. (a) $|3 - 4i| = 5$ and $\arg(3 - 4i) = \tan^{-1}\frac{-4}{3} \approx -0.927$.

So $|(3 - 4i)^7| = 5^7$ and $\arg(3 - 4i)^7 = 7\tan^{-1}(-4/3) + 2k\pi \approx -0.2079$ ($k = 1$). So $(3 - 4i)^7 = 5^7 e^{i(-0.2079)}$.

(b) $|-5 + 12i| = 13$ and $\arg(-5 + 12i) = \tan^{-1}[12/(-5)] + \pi \approx 1.966$.

So $|(-5 + 12i)^8| = 13^8$ and $\arg(-5 + 12i)^8 = 8\left(\tan^{-1}\frac{12}{-5} + \pi\right) + 2k\pi \approx -3.125$ ($k = -3$). So $(-5 + 12i)^8 = 13^8 e^{i(-3.125)}$.

Answer to Exercise 171. (a) $|-1 - i| = \sqrt{2} = 2^{0.5}$ and $\arg(-1 - i) = \tan^{-1}\frac{1}{1} - \pi = -0.75\pi$.

So $|z^{10}| = 2^5$ and $\arg z^{10} = 10 \times (-0.75\pi) + 2k\pi = -7.5\pi + 2k\pi = 0.5\pi$ ($k = 4$). Altogether then, in polar and exponential forms,

$$z^{10} = 2^5[\cos(0.5\pi) + i\sin(0.5\pi)], \qquad z^{10} = 2^5 e^{i\pi(0.5)}.$$

Even without a calculator, we know that $\cos(0.5\pi) = 0$ and $\sin(0.5\pi) = 1$, and so in standard form, $z^{10} = 2^5 i = 32i$. (Alternatively, you may recognise that since $e^{i\pi} = -1$, we must have $e^{i\pi(0.5)} = i$. Thus, $2^5 e^{i\pi(0.5)} = 2^5 i = 32i$.)

(b) $|2 + i| = \sqrt{5} = 5^{0.5}$ and $\arg(2 + i) = \tan^{-1}\frac{1}{2}$.

So $|z^{10}| = 5^5$ and $\arg z^{10} = 10 \times \tan^{-1}\frac{1}{2} + 2k\pi \approx 4.636 + 2k\pi \approx -1.647$ ($k = -1$). Altogether then, in polar and exponential forms,

$$z^{10} = 5^5[\cos(-1.647) + i\sin(-1.647)], \qquad z^{10} = 5^5 e^{i(-1.647)}.$$

For the standard form, just punch into your calculator $5^5 \times \cos\left(10\tan^{-1}\frac{1}{2}\right) = -237$ and $5^5 \times \sin\left(10\tan^{-1}\frac{1}{2}\right) = -3116$ to get $z^{10} = -237 - 3116i$.

(c) $|1 - 3i| = \sqrt{10} = 10^{0.5}$ and $\arg(1 - 3i) = \tan^{-1}\frac{-3}{1}$.

So $|z^{10}| = 10^5$ and $\arg z^{10} = 10\tan^{-1}(-3) + 2k\pi \approx -12.490 + 2k\pi \approx 0.0759$ ($k = 2$). Altogether then, in polar and exponential forms,

$$z^{10} = 10^5[\cos(0.0759) + i\sin(0.0759)], \qquad z^{10} = 10^5 e^{i(0.0759)}.$$

For the standard form, just punch into your calculator $10^5 \times \cos(10\tan^{-1}(-3)) = 99712$ and $10^5 \times \sin(10\tan^{-1}(-3)) = 7584$ to get $z^{10} = 99712 + 7584i$.
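Because the three numbers in Exercise 171 have integer real and imaginary parts, the standard forms can also be confirmed without any trigonometry. The following is a minimal sketch under that observation; the helper name `gaussian_pow` is invented for this illustration and simply repeats the multiplication $(p + qi)(a + bi) = (pa - qb) + (pb + qa)i$ in exact integer arithmetic.

```python
def gaussian_pow(a, b, n):
    """Return (re, im) of (a + b*i)**n using exact integer arithmetic."""
    re, im = 1, 0  # start from 1 + 0i
    for _ in range(n):
        # (re + im*i) * (a + b*i) = (re*a - im*b) + (re*b + im*a)*i
        re, im = re * a - im * b, re * b + im * a
    return re, im

print(gaussian_pow(-1, -1, 10))  # (-1 - i)^10
print(gaussian_pow(2, 1, 10))    # (2 + i)^10
print(gaussian_pow(1, -3, 10))   # (1 - 3i)^10
```

The three results should match $32i$, $-237 - 3116i$, and $99712 + 7584i$ respectively.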


Answer to Exercise 172. (a) $-1 - i$ has modulus $\sqrt{2}$ and argument $\tan^{-1}\frac{1}{1} - \pi = -0.75\pi$. So the roots of the equation $z^{10} = -1 - i$ have modulus $2^{1/20}$ and arguments $\frac{-0.75\pi + 2k\pi}{10}$, for $k = 0, \pm 1, \pm 2, \pm 3, \pm 4, 5$. Altogether then,

$$z = 2^{1/20} e^{i(-0.75\pi + 2k\pi)/10}, \quad \text{for } k = 0, \pm 1, \pm 2, \pm 3, \pm 4, 5.$$

(b) $2 + i$ has modulus $\sqrt{5}$ and argument $\tan^{-1}\frac{1}{2}$. So the roots of the equation $z^{11} = 2 + i$ have modulus $5^{1/22}$ and arguments $\frac{\tan^{-1}(1/2) + 2k\pi}{11}$, for $k = 0, \pm 1, \pm 2, \pm 3, \pm 4, \pm 5$. Altogether then,

$$z = 5^{1/22} e^{i[\tan^{-1}(1/2) + 2k\pi]/11}, \quad \text{for } k = 0, \pm 1, \pm 2, \pm 3, \pm 4, \pm 5.$$

(c) $1 - 3i$ has modulus $\sqrt{10}$ and argument $\tan^{-1}(-3/1)$. So the roots of the equation $z^{12} = 1 - 3i$ have modulus $10^{1/24}$ and arguments $\frac{\tan^{-1}(-3) + 2k\pi}{12}$, for $k = 0, \pm 1, \pm 2, \pm 3, \pm 4, \pm 5, 6$. Altogether then,

$$z = 10^{1/24} e^{i[\tan^{-1}(-3) + 2k\pi]/12}, \quad \text{for } k = 0, \pm 1, \pm 2, \pm 3, \pm 4, \pm 5, 6.$$
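The root formula used in Exercise 172 can be illustrated with a short numerical sketch. The helper `nth_roots` below is an assumption of this example (it is not from the text); it takes the modulus and principal argument of $w$ and returns the $n$ values $r^{1/n} e^{i(\theta + 2k\pi)/n}$. It indexes the roots with $k = 0, 1, \dots, n - 1$ rather than the symmetric range used above, which gives the same set of roots.

```python
import cmath
import math

def nth_roots(w, n):
    """Return the n complex n-th roots of w as a list."""
    r, theta = cmath.polar(w)  # modulus and principal argument of w
    return [cmath.rect(r ** (1 / n), (theta + 2 * math.pi * k) / n)
            for k in range(n)]

roots = nth_roots(-1 - 1j, 10)
for z in roots:
    # Each root raised to the 10th power should return approximately -1 - i.
    print(z, z ** 10)
```

Every root has modulus $2^{1/20}$, in line with part (a).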

90 Answers to Exercises in Part V: Calculus

90.1 Answers for Ch. 45: Solving Problems Involving Differentiation

Answer to Exercise 173. To the given equation $x^2 y + \sin x = 0$, apply $\frac{d}{dx}$ to get $2xy + x^2\frac{dy}{dx} + \cos x = 0$. So $\frac{dy}{dx} = -\frac{2xy + \cos x}{x^2}$ (for $x \ne 0$). And $\frac{dx}{dy} = -\frac{x^2}{2xy + \cos x}$ (for $2xy + \cos x \ne 0$).

Alternative Method. To find $\frac{dx}{dy}$, we could also have applied $\frac{d}{dy}$ to the given equation: $2x\frac{dx}{dy}y + x^2 + \cos x\frac{dx}{dy} = 0$ and so again, $\frac{dx}{dy} = -\frac{x^2}{2xy + \cos x}$.

Answer to Exercise 174. Given $x = \cos t + t^2$ and $y = e^t - t^3$, we may compute $\frac{dx}{dt} = -\sin t + 2t$ and $\frac{dy}{dt} = e^t - 3t^2$. So $\frac{dy}{dx} = \frac{e^t - 3t^2}{-\sin t + 2t}$ (for $-\sin t + 2t \ne 0$).

Answer to Exercise 175. Given $x = t^5 + t$ and $y = t^4 - t$, we have $t = 0 \implies (x, y) = (0, 0)$. And $t = 1 \implies (x, y) = (2, 0)$. Compute

$$\frac{dy}{dx} = \frac{dy}{dt} \div \frac{dx}{dt} = \frac{4t^3 - 1}{5t^4 + 1} \quad (\text{for } 5t^4 + 1 \ne 0).$$

And so at $t = 0$, $\frac{dy}{dx} = \frac{4t^3 - 1}{5t^4 + 1} = -1$ and so the tangent line at $t = 0$ has equation $y = -x$.

While at $t = 1$, $\frac{dy}{dx} = \frac{4t^3 - 1}{5t^4 + 1} = \frac{3}{6} = \frac{1}{2}$ and so the tangent line at $t = 1$ has equation $y = \frac{1}{2}(x - 2) = \frac{1}{2}x - 1$.

And so at their intersection, we have $-x = \frac{1}{2}x - 1$ or $x = \frac{2}{3}$. So their intersection point is $\left(\frac{2}{3}, -\frac{2}{3}\right)$.
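The tangent-line calculation in Exercise 175 can be reproduced with exact rational arithmetic. The following is only a sketch; the helper `tangent` and the variable names are assumptions introduced for this illustration. It returns the slope and $y$-intercept of the tangent to the curve $(t^5 + t,\ t^4 - t)$ at a given parameter value, then intersects the two tangent lines.

```python
from fractions import Fraction

def tangent(t):
    """Slope and y-intercept of the tangent to (t^5 + t, t^4 - t) at parameter t."""
    t = Fraction(t)
    x, y = t**5 + t, t**4 - t
    slope = (4 * t**3 - 1) / (5 * t**4 + 1)  # dy/dx = (dy/dt) / (dx/dt)
    return slope, y - slope * x

m0, c0 = tangent(0)  # tangent at t = 0
m1, c1 = tangent(1)  # tangent at t = 1

# Solve m0*x + c0 = m1*x + c1 for the intersection point.
x_star = (c1 - c0) / (m0 - m1)
y_star = m0 * x_star + c0
print(x_star, y_star)
```

The printed point should be $\left(\frac{2}{3}, -\frac{2}{3}\right)$.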


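Answer to Exercise 176. (a) The volume is fixed as $1 = \frac{1}{3}\pi r^2 h$. So $r = \sqrt{\frac{3}{\pi h}}$.

(b) By the Pythagorean Theorem, $l = \sqrt{r^2 + h^2} = \sqrt{\frac{3}{\pi h} + h^2}$.

(c) The external surface area of the cone (excluding the base) is

$$A = \pi r l = \pi\sqrt{\frac{3}{\pi h}}\sqrt{\frac{3}{\pi h} + h^2} = \pi\sqrt{\frac{9}{\pi^2 h^2} + \frac{3h}{\pi}} = \sqrt{\frac{9}{h^2} + 3\pi h}.$$

(d) Compute

$$\frac{dA}{dh} = \frac{-\frac{18}{h^3} + 3\pi}{2\sqrt{\frac{9}{h^2} + 3\pi h}} = \frac{3}{2} \cdot \frac{\pi - \frac{6}{h^3}}{\sqrt{\frac{9}{h^2} + 3\pi h}} = \frac{3}{2} \cdot \frac{\pi - \frac{6}{h^3}}{A}.$$

So $\frac{dA}{dh} = 0 \iff h = \left(\frac{6}{\pi}\right)^{1/3}$.

(e) Use the quotient rule:

$$\frac{d^2A}{dh^2} = \frac{3}{2} \cdot \frac{\frac{18}{h^4}A - \left(\pi - \frac{6}{h^3}\right)\frac{dA}{dh}}{A^2} = \frac{3}{2} \cdot \frac{\frac{18}{h^4}A - \left(\pi - \frac{6}{h^3}\right)\frac{3}{2} \cdot \frac{\pi - \frac{6}{h^3}}{A}}{A^2} = \frac{9}{4} \cdot \frac{\frac{12}{h^4}A^2 - \left(\pi - \frac{6}{h^3}\right)^2}{A^3}.$$

(f) Expanding the numerator,

$$\frac{12}{h^4}A^2 - \left(\pi - \frac{6}{h^3}\right)^2 = \frac{12}{h^4}\left(\frac{9}{h^2} + 3\pi h\right) - \left(\pi - \frac{6}{h^3}\right)^2 = \frac{108}{h^6} + \frac{36\pi}{h^3} - \left(\pi^2 + \frac{36}{h^6} - \frac{12\pi}{h^3}\right) = \frac{72}{h^6} + \frac{48\pi}{h^3} - \pi^2.$$

This is a ∪-shaped quadratic in $\frac{1}{h^3}$, whose discriminant is $(48\pi)^2 - 4(72)(-\pi^2) = 2592\pi^2 > 0$, so it is not positive for every $h$. At the stationary point found in (d), however, $\pi - \frac{6}{h^3} = 0$, and so the numerator there equals $\frac{12}{h^4}A^2 > 0$.

(g) Hence $\frac{d^2A}{dh^2}$ is positive at the stationary point, which is therefore a minimum point. Moreover, by (d), $\frac{dA}{dh}$ has the same sign as $\pi - \frac{6}{h^3}$, which is negative for $h < \left(\frac{6}{\pi}\right)^{1/3}$ and positive for $h > \left(\frac{6}{\pi}\right)^{1/3}$. That is, $A$ is strictly decreasing before the stationary point and strictly increasing after it. So the stationary point we found in (d) must also be the global minimum point.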

90.2 Answers for Ch. 46: Maclaurin Series

Answer to Exercise 177. (a) Given $f: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto (1 + x)^n$, we have $f(0) = 1$, $f'(x) = n(1 + x)^{n-1}$, $f'(0) = n$, $f''(x) = n(n - 1)(1 + x)^{n-2}$, $f''(0) = n(n - 1)$, $f^{(3)}(x) = n(n - 1)(n - 2)(1 + x)^{n-3}$, and $f^{(3)}(0) = n(n - 1)(n - 2)$. Thus,

$$M_3(x) = 1 + nx + \frac{n(n - 1)}{2!}x^2 + \frac{n(n - 1)(n - 2)}{3!}x^3.$$

(b) Given $g: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto \sin x$, we have $g(0) = 0$, $g'(x) = \cos x$, $g'(0) = 1$, $g''(x) = -\sin x$, $g''(0) = 0$, $g^{(3)}(x) = -\cos x$, and $g^{(3)}(0) = -1$. Thus, $M_3(x) = x - \frac{x^3}{3!}$.

(c) Given $h: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto \cos x$, we have $h(0) = 1$, $h'(x) = -\sin x$, $h'(0) = 0$, $h''(x) = -\cos x$, $h''(0) = -1$, $h^{(3)}(x) = \sin x$, and $h^{(3)}(0) = 0$. Thus, $M_3(x) = 1 - \frac{x^2}{2!}$.

(d) Given $i: \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto \ln(1 + x)$, we have $i(0) = 0$, $i'(x) = \frac{1}{1 + x}$, $i'(0) = 1$, $i''(x) = -\frac{1}{(1 + x)^2}$, $i''(0) = -1$, $i^{(3)}(x) = \frac{2}{(1 + x)^3}$, and $i^{(3)}(0) = 2$. Thus, $M_3(x) = x - \frac{x^2}{2} + \frac{x^3}{3}$.


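Answer to Exercise 178. $\sin 0 = 0$. And $M_0(0) = 0$, $M_1(0) = 0$, $M_2(0) = 0$, and indeed $M_n(0) = 0$ for all $n$. So it does appear "plausible" that $\sin 0 = M(0)$.

It does appear "plausible" that $\sin\left(\frac{\pi}{2}\right) = M\left(\frac{\pi}{2}\right)$, because $\sin\frac{\pi}{2} = 1$ and ...

$$\begin{aligned}
M_0\left(\tfrac{\pi}{2}\right) &= 0, \\
M_1\left(\tfrac{\pi}{2}\right) = M_2\left(\tfrac{\pi}{2}\right) &= \frac{\pi}{2} \approx 1.571, \\
M_3\left(\tfrac{\pi}{2}\right) = M_4\left(\tfrac{\pi}{2}\right) &= \frac{\pi}{2} - \frac{(\pi/2)^3}{3!} \approx 0.925, \\
M_5\left(\tfrac{\pi}{2}\right) = M_6\left(\tfrac{\pi}{2}\right) &= \frac{\pi}{2} - \frac{(\pi/2)^3}{3!} + \frac{(\pi/2)^5}{5!} \approx 1.005, \\
M_7\left(\tfrac{\pi}{2}\right) = M_8\left(\tfrac{\pi}{2}\right) &= \frac{\pi}{2} - \frac{(\pi/2)^3}{3!} + \frac{(\pi/2)^5}{5!} - \frac{(\pi/2)^7}{7!} \approx 1.000, \\
M_n\left(\tfrac{\pi}{2}\right) &\approx 1.000, \quad \text{for all } n \ge 7.
\end{aligned}$$

It does appear "plausible" that $\sin(\pi) = M(\pi)$, because $\sin\pi = 0$ and ...

$$\begin{aligned}
M_0(\pi) &= 0, \\
M_1(\pi) = M_2(\pi) &= \pi \approx 3.142, \\
M_3(\pi) = M_4(\pi) &= \pi - \frac{\pi^3}{3!} \approx -2.026, \\
M_5(\pi) = M_6(\pi) &= \pi - \frac{\pi^3}{3!} + \frac{\pi^5}{5!} \approx 0.524, \\
M_7(\pi) = M_8(\pi) &= \pi - \frac{\pi^3}{3!} + \frac{\pi^5}{5!} - \frac{\pi^7}{7!} \approx -0.075, \\
M_9(\pi) = M_{10}(\pi) &= \pi - \frac{\pi^3}{3!} + \frac{\pi^5}{5!} - \frac{\pi^7}{7!} + \frac{\pi^9}{9!} \approx 0.007, \\
M_{11}(\pi) = M_{12}(\pi) &= \pi - \frac{\pi^3}{3!} + \frac{\pi^5}{5!} - \frac{\pi^7}{7!} + \frac{\pi^9}{9!} - \frac{\pi^{11}}{11!} \approx 0.000, \\
M_n(\pi) &\approx 0.000, \quad \text{for all } n \ge 13.
\end{aligned}$$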
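The partial sums tabulated in Exercise 178 are easy to regenerate. The sketch below is an illustration only; the function name `maclaurin_sin` is an assumption of this example. It evaluates the degree-$n$ Maclaurin polynomial of $\sin$, which contains only the odd powers of $x$ up to $n$.

```python
import math

def maclaurin_sin(x, n):
    """Evaluate the degree-n Maclaurin polynomial M_n of sin at x."""
    # Terms are (-1)^j x^(2j+1) / (2j+1)! for every odd power 2j+1 <= n.
    return sum((-1) ** j * x ** (2 * j + 1) / math.factorial(2 * j + 1)
               for j in range((n + 1) // 2))

for n in range(0, 14):
    print(n, maclaurin_sin(math.pi / 2, n), maclaurin_sin(math.pi, n))
```

Since the even-degree coefficients of $\sin$ are zero, consecutive values come in equal pairs ($M_1 = M_2$, $M_3 = M_4$, and so on), as in the answer.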


Answer to Exercise 179. $f(0) = \sin 0\cos 0 = 0$.

$f'(x) = \cos x\cos x - \sin x\sin x = \cos^2 x - \sin^2 x = 1 - 2\sin^2 x$. So $f'(0) = 1$.

$f''(x) = -4\sin x\cos x = -4f(x)$. So $f''(0) = 0$.

$f^{(3)}(x) = -4f'(x)$. So $f^{(3)}(0) = -4f'(0) = -4$.

And so indeed $M_3(x) = x + \frac{-4}{3!}x^3 = x - \frac{2}{3}x^3$. This is indeed consistent with our finding in Example 440 that $\sin x\cos x = x - \frac{2}{3}x^3 + \dots$.

Answer to Exercise 180. $f(0) = 0$.

$f'(x) = \cos x\ln(1 + x) + \frac{\sin x}{1 + x}$, so $f'(0) = 0$.

$f''(x) = -\sin x\ln(1 + x) + \frac{\cos x}{1 + x} + \frac{(1 + x)\cos x - \sin x}{(1 + x)^2} = -f(x) + 2\frac{\cos x}{1 + x} - \frac{\sin x}{(1 + x)^2}$, so $f''(0) = 1 + 1 = 2$.

$f^{(3)}(x) = -f'(x) + 2\frac{-\sin x(1 + x) - \cos x}{(1 + x)^2} - \frac{(1 + x)^2\cos x - 2(1 + x)\sin x}{(1 + x)^4}$, so $f^{(3)}(0) = -f'(0) - 2 - 1 = -3$.

Hence, $M_3(x) = 0 + 0 + \frac{2}{2!}x^2 + \frac{-3}{3!}x^3 = x^2 - \frac{1}{2}x^3$.

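Answer to Exercise 181. For $\ln(x + 1) \in \mathbb{R}$, we have $\sin[\ln(x + 1)] = \ln(x + 1) - \frac{[\ln(x + 1)]^3}{3!} + \dots$. For $x \in (-1, 1]$, we have $\ln(x + 1) = x - \frac{x^2}{2} + \frac{x^3}{3} - \dots$. Hence, for $x \in (-1, 1]$ (this is the range of values for which the Maclaurin series for $\sin[\ln(1 + x)]$ converges), we have

$$\begin{aligned}
\sin[\ln(x + 1)] &= \left(x - \frac{x^2}{2} + \frac{x^3}{3} - \dots\right) - \frac{\left(x - \frac{x^2}{2} + \frac{x^3}{3} - \dots\right)^3}{3!} + \dots \\
&= x - \frac{x^2}{2} + x^3\left(\frac{1}{3} - \frac{1}{3!}\right) + \dots = x - \frac{x^2}{2} + \frac{x^3}{6} + \dots
\end{aligned}$$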

90.3 Answers for Ch. 47: The Indefinite Integral

Answer to Exercise 182. (a) $F'(x) = 4\sin 4x$, so that indeed $F$ is an indefinite integral for $f$. And:

$$G'(x) = 8(2\sin x\cos^3 x - 2\sin^3 x\cos x) = 16\sin x\cos x(\cos^2 x - \sin^2 x) = 8\sin 2x\cos 2x = 4\sin 4x,$$

so that indeed $G$ is an indefinite integral for $f$.

(b) Although $F$ and $G$ seem to be very different functions, they actually differ only by a constant (namely 1), as we now show:

$$G(x) = 8\sin^2 x\cos^2 x = 2(2\sin x\cos x)(2\sin x\cos x) = 2\sin^2 2x = 1 - (1 - 2\sin^2 2x) = 1 - \cos 4x = 1 + F(x).$$

So this example does not contradict the assertion that “the indefinite integral is unique up to a constant”.
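The claim that $G$ and $F$ differ by the constant 1 can be spot-checked numerically. This is a minimal sketch, assuming (as the working above implies) that $F(x) = -\cos 4x$ and $G(x) = 8\sin^2 x\cos^2 x$; the sample points are arbitrary.

```python
import math

def F(x):
    """F(x) = -cos(4x), as implied by 1 - cos(4x) = 1 + F(x)."""
    return -math.cos(4 * x)

def G(x):
    """G(x) = 8 sin^2(x) cos^2(x)."""
    return 8 * math.sin(x) ** 2 * math.cos(x) ** 2

# G(x) - F(x) should be 1 (up to rounding) at every sample point.
differences = [G(x) - F(x) for x in (-2.0, -0.3, 0.0, 0.7, 1.5, 10.0)]
print(differences)
```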

90.4 Answers for Ch. 48: Integration Techniques

Answer to Exercise 183. $\frac{d}{dx}(kx + C) = k$ for all $x$, so by definition, $\int k\,dx = kx + C$. ✓ Similarly,

$$\begin{aligned}
\frac{d}{dx}\left(\frac{x^{n+1}}{n + 1} + C\right) = x^n &\implies \int x^n\,dx = \frac{x^{n+1}}{n + 1} + C, \\
\frac{d}{dx}(-\cos x + C) = \sin x &\implies \int \sin x\,dx = -\cos x + C, \\
\frac{d}{dx}(\sin x + C) = \cos x &\implies \int \cos x\,dx = \sin x + C, \\
\frac{d}{dx}(e^x + C) = e^x &\implies \int e^x\,dx = e^x + C, \\
\frac{d}{dx}[kF(x) + C] = kf(x) &\implies \int kf(x)\,dx = kF(x) + C, \\
\frac{d}{dx}[F(x) \pm G(x) + C] = f(x) \pm g(x) &\implies \int f(x) \pm g(x)\,dx = F(x) \pm G(x) + C.
\end{aligned}$$

Each of these is thus verified. ✓

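Answer to Exercise 184. (b) By Corollary 2, $\frac{d}{dx}\sin^{-1} x = \frac{1}{\sqrt{1 - x^2}}$ (for $|x| < 1$). Hence,

$$\frac{d}{dx}\left[\sin^{-1}\left(\frac{x}{a}\right) + C\right] = \frac{1}{a} \cdot \frac{1}{\sqrt{1 - \left(\frac{x}{a}\right)^2}} = \frac{1}{\sqrt{a^2 - x^2}}.$$

So indeed $\int \frac{1}{\sqrt{a^2 - x^2}}\,dx = \sin^{-1}\left(\frac{x}{a}\right) + C$ (for $|x| < a$).

(d) Let $x \ne \pm a$.

Case #1: $\frac{a + x}{a - x} > 0$.

$$\begin{aligned}
\frac{d}{dx}\left(\frac{1}{2a}\ln\left|\frac{a + x}{a - x}\right| + C\right) &= \frac{d}{dx}\left(\frac{1}{2a}\ln\frac{a + x}{a - x} + C\right) = \frac{1}{2a}\frac{d}{dx}[\ln(a + x) - \ln(a - x)] \\
&= \frac{1}{2a}\left(\frac{1}{a + x} + \frac{1}{a - x}\right) = \frac{1}{2a} \cdot \frac{a - x + (a + x)}{(a + x)(a - x)} = \frac{1}{2a} \cdot \frac{2a}{a^2 - x^2} = \frac{1}{a^2 - x^2},
\end{aligned}$$

so that indeed $\int \frac{1}{a^2 - x^2}\,dx = \frac{1}{2a}\ln\left|\frac{a + x}{a - x}\right| + C$.

Case #2: $\frac{a + x}{a - x} < 0$.

$$\begin{aligned}
\frac{d}{dx}\left(\frac{1}{2a}\ln\left|\frac{a + x}{a - x}\right| + C\right) &= \frac{d}{dx}\left(\frac{1}{2a}\ln\frac{a + x}{x - a} + C\right) = \frac{1}{2a}\frac{d}{dx}[\ln(a + x) - \ln(x - a)] \\
&= \frac{1}{2a}\left(\frac{1}{a + x} - \frac{1}{x - a}\right) = \frac{1}{2a}\left(\frac{1}{a + x} + \frac{1}{a - x}\right) = \frac{1}{a^2 - x^2},
\end{aligned}$$

so that indeed $\int \frac{1}{a^2 - x^2}\,dx = \frac{1}{2a}\ln\left|\frac{a + x}{a - x}\right| + C$.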

Answer to Exercise 184. (f) Let $x$ not be an integer multiple of $\pi$, so that $\sin x \ne 0$.

Case #1: $\sin x > 0$.

$$\frac{d}{dx}(\ln|\sin x| + C) = \frac{d}{dx}(\ln\sin x + C) = \frac{\cos x}{\sin x} = \cot x,$$

so that indeed $\int \cot x\,dx = \ln|\sin x| + C$.

Case #2: $\sin x < 0$.

$$\frac{d}{dx}(\ln|\sin x| + C) = \frac{d}{dx}[\ln(-\sin x) + C] = \frac{-\cos x}{-\sin x} = \cot x,$$

so that again $\int \cot x\,dx = \ln|\sin x| + C$.

Answer to Exercise 184. (g) Let $x$ not be an integer multiple of $\pi$, so that $\csc x + \cot x$ is well-defined.

Case #1: $\csc x + \cot x > 0$.

$$\frac{d}{dx}(-\ln|\csc x + \cot x| + C) = \frac{d}{dx}[-\ln(\csc x + \cot x) + C] = -\frac{-\csc x\cot x - \csc^2 x}{\csc x + \cot x} = \frac{\csc x(\cot x + \csc x)}{\csc x + \cot x} = \csc x,$$

so that indeed $\int \csc x\,dx = -\ln|\csc x + \cot x| + C$.

Case #2: $\csc x + \cot x < 0$.

$$\frac{d}{dx}(-\ln|\csc x + \cot x| + C) = \frac{d}{dx}[-\ln(-\csc x - \cot x) + C] = -\frac{\csc x\cot x + \csc^2 x}{-\csc x - \cot x} = \frac{\csc x(\cot x + \csc x)}{\csc x + \cot x} = \csc x,$$

so that again $\int \csc x\,dx = -\ln|\csc x + \cot x| + C$.

Answer to Exercise 184. (h) Let $x$ not be an odd-integer multiple of $\pi/2$, so that $\sec x + \tan x$ is well-defined.

Case #1: $\sec x + \tan x > 0$.

$$\frac{d}{dx}(\ln|\sec x + \tan x| + C) = \frac{d}{dx}[\ln(\sec x + \tan x) + C] = \frac{\sec x\tan x + \sec^2 x}{\sec x + \tan x} = \frac{\sec x(\sec x + \tan x)}{\sec x + \tan x} = \sec x,$$

so that indeed $\int \sec x\,dx = \ln|\sec x + \tan x| + C$.

Case #2: $\sec x + \tan x < 0$. Then

$$\frac{d}{dx}(\ln|\sec x + \tan x| + C) = \frac{d}{dx}[\ln(-\sec x - \tan x) + C] = \frac{-\sec x\tan x - \sec^2 x}{-\sec x - \tan x} = \frac{\sec x(\sec x + \tan x)}{\sec x + \tan x} = \sec x,$$

so that again $\int \sec x\,dx = \ln|\sec x + \tan x| + C$.

Answer to Exercise 185. (b) Recall that $\cos 2x = 2\cos^2 x - 1$. Thus,

$$\int \cos^2 x\,dx = \int \frac{\cos 2x + 1}{2}\,dx = \frac{1}{2}x + \frac{\sin 2x}{4} + C.$$

(c) Recall that $\tan^2 x = \sec^2 x - 1$ and $\int \sec^2 x\,dx = \tan x + C$. Thus,

$$\int \tan^2 x\,dx = \int \sec^2 x - 1\,dx = \tan x - x + C.$$

(d) Recall that $\sin P + \sin Q = 2\sin\frac{P + Q}{2}\cos\frac{P - Q}{2}$. So let $mx = \frac{P + Q}{2}$ and $nx = \frac{P - Q}{2}$. So $P = (m + n)x$ and $Q = (m - n)x$. Thus,

$$\sin(mx)\cos(nx) = \frac{1}{2}\{\sin[(m + n)x] + \sin[(m - n)x]\}$$

$$\implies \int \sin(mx)\cos(nx)\,dx = -\frac{1}{2}\left\{\frac{\cos[(m + n)x]}{m + n} + \frac{\cos[(m - n)x]}{m - n}\right\} + C.$$

(e) Recall that $\cos P - \cos Q = -2\sin\frac{P + Q}{2}\sin\frac{P - Q}{2}$. So let $mx = \frac{P + Q}{2}$ and $nx = \frac{P - Q}{2}$. So $P = (m + n)x$ and $Q = (m - n)x$. Thus,

$$\sin(mx)\sin(nx) = -\frac{1}{2}\{\cos[(m + n)x] - \cos[(m - n)x]\}$$

$$\implies \int \sin(mx)\sin(nx)\,dx = \frac{1}{2}\left\{\frac{\sin[(m - n)x]}{m - n} - \frac{\sin[(m + n)x]}{m + n}\right\} + C.$$

(f) Recall that $\cos P + \cos Q = 2\cos\frac{P + Q}{2}\cos\frac{P - Q}{2}$. So let $mx = \frac{P + Q}{2}$ and $nx = \frac{P - Q}{2}$. So $P = (m + n)x$ and $Q = (m - n)x$. Thus,

$$\cos(mx)\cos(nx) = \frac{1}{2}\{\cos[(m + n)x] + \cos[(m - n)x]\}$$

$$\implies \int \cos(mx)\cos(nx)\,dx = \frac{1}{2}\left\{\frac{\sin[(m - n)x]}{m - n} + \frac{\sin[(m + n)x]}{m + n}\right\} + C.$$
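One way to double-check antiderivatives such as those in Exercise 185 is to differentiate them numerically and compare with the integrand. The sketch below does this with a central difference; the helper names, the step size, the sample points, and the choice $m = 3$, $n = 2$ are all assumptions made for this illustration.

```python
import math

def derivative(F, x, h=1e-6):
    """Central-difference approximation to F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

def max_error(F, f, points=(0.1, 0.5, 1.0, 1.3)):
    """Largest gap between the numerical F' and the integrand f at the sample points."""
    return max(abs(derivative(F, x) - f(x)) for x in points)

m, n = 3, 2  # arbitrary illustrative values with m != n

checks = {
    "(b) cos^2 x": (lambda x: x / 2 + math.sin(2 * x) / 4,
                    lambda x: math.cos(x) ** 2),
    "(c) tan^2 x": (lambda x: math.tan(x) - x,
                    lambda x: math.tan(x) ** 2),
    "(d) sin(mx)cos(nx)": (lambda x: -0.5 * (math.cos((m + n) * x) / (m + n)
                                             + math.cos((m - n) * x) / (m - n)),
                           lambda x: math.sin(m * x) * math.cos(n * x)),
}

for name, (F, f) in checks.items():
    print(name, max_error(F, f))
```

Each printed error should be very close to zero; a sign slip in an antiderivative (for instance $\tan x + x$ in place of $\tan x - x$) would show up as an error of order 1.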


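Answer to Exercise 186. (a) (i) Note that the given substitution $x = 3\sec u$ implies $u = \sec^{-1}\left(\frac{x}{3}\right)$ and $\frac{dx}{du} = 3\sec u\tan u$. Now,

$$\begin{aligned}
\int \frac{9}{x^2\sqrt{x^2 - 9}}\,dx &= \int \frac{9}{9\sec^2 u\sqrt{9\sec^2 u - 9}}\,dx = \int \frac{1}{3\sec^2 u\sqrt{\sec^2 u - 1}}\,dx \\
&= \int \frac{1}{3\sec^2 u\sqrt{\tan^2 u}}\,dx = \int \frac{1}{3\sec^2 u\tan u}\,dx \\
&\overset{1}{=} \int \frac{1}{3\sec^2 u\tan u}\frac{dx}{du}\,du + C_1 = \int \frac{1}{3\sec^2 u\tan u}\,3\sec u\tan u\,du + C_1 \\
&= \int \frac{1}{\sec u}\,du + C_1 = \int \cos u\,du + C_1 = \sin u + C = \sin\left(\sec^{-1}\left(\frac{x}{3}\right)\right) + C,
\end{aligned}$$

where $\overset{1}{=}$ uses Theorem 9 ("multiply by $\frac{du}{du} = 1$").

(a) (ii) Note that the given substitution $x = \frac{3}{\sqrt{1 - u}}$ implies $\frac{dx}{du} = \frac{3}{2(1 - u)^{3/2}}$ and $u = 1 - \frac{9}{x^2}$.

$$\begin{aligned}
\int \frac{9}{x^2\sqrt{x^2 - 9}}\,dx &\overset{1}{=} \int \frac{9}{x^2\sqrt{x^2 - 9}} \cdot \frac{3}{2(1 - u)^{3/2}}\,du + C_1 = \int \frac{9}{\frac{9}{1 - u}\sqrt{\frac{9}{1 - u} - 9}} \cdot \frac{3}{2(1 - u)^{3/2}}\,du + C_1 \\
&= \int \frac{9}{\sqrt{9 - 9(1 - u)}} \cdot \frac{3(1 - u)^{3/2}}{2 \cdot 9\,(1 - u)^{3/2}}\,du + C_1 = \int \frac{1}{2\sqrt{u}}\,du + C_1 = \sqrt{u} + C = \sqrt{1 - \frac{9}{x^2}} + C,
\end{aligned}$$

where $\overset{1}{=}$ uses Theorem 9 ("multiply by $\frac{du}{du} = 1$").

(a) (iii) Let $y$ = Hypotenuse / Adjacent. Fix "Adjacent" $= 1$, so that "Hypotenuse" $= y$ and "Opposite" $= \sqrt{\text{Hypotenuse}^2 - \text{Adjacent}^2} = \sqrt{y^2 - 1^2} = \sqrt{y^2 - 1}$. So

$$\sin(\sec^{-1} y) = \frac{\text{Opposite}}{\text{Hypotenuse}} = \frac{\sqrt{y^2 - 1}}{y} = \sqrt{1 - \frac{1}{y^2}}.$$

Applying the above result to our present context, we have

$$\sin\left(\sec^{-1}\left(\frac{x}{3}\right)\right) = \sqrt{1 - \frac{1}{\left(\frac{x}{3}\right)^2}} = \sqrt{1 - \frac{9}{x^2}}.$$

Indeed, the answers in (i) and (ii) are exactly identical!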

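(b) (i) $x = \frac{3}{2}\tan u$ implies $u = \tan^{-1}\left(\frac{2x}{3}\right)$ and $\frac{dx}{du} = \frac{3}{2}\sec^2 u$. Now,

$$\begin{aligned}
\int \frac{x^3}{(4x^2 + 9)^{3/2}}\,dx &= \int \frac{\left(\frac{3}{2}\tan u\right)^3}{\left[4\left(\frac{3}{2}\tan u\right)^2 + 9\right]^{3/2}}\,dx = \int \frac{\left(\frac{3}{2}\tan u\right)^3}{(9\tan^2 u + 9)^{3/2}}\,dx \\
&= \int \frac{\left(\frac{3}{2}\tan u\right)^3}{(9\sec^2 u)^{3/2}}\,dx = \frac{1}{8}\int \frac{\tan^3 u}{\sec^3 u}\,dx = \frac{1}{8}\int \sin^3 u\,dx \\
&\overset{1}{=} \frac{1}{8}\int \sin^3 u\,\frac{dx}{du}\,du + C_1 = \frac{1}{8}\int \sin^3 u\,\frac{3}{2}\sec^2 u\,du + C_1 = \frac{3}{16}\int \frac{\sin^3 u}{\cos^2 u}\,du + C_1 \\
&= \frac{3}{16}\int \frac{1 - \cos^2 u}{\cos^2 u}\sin u\,du + C_1 = \frac{3}{16}\int \frac{\sin u}{\cos^2 u} - \sin u\,du + C_1 \\
&= \frac{3}{16}\left(\frac{1}{\cos u} + \cos u\right) + C = \frac{3}{16}\left[\frac{1}{\cos\left(\tan^{-1}\left(\frac{2x}{3}\right)\right)} + \cos\left(\tan^{-1}\left(\frac{2x}{3}\right)\right)\right] + C,
\end{aligned}$$

where $\overset{1}{=}$ uses Theorem 9 ("multiply by $\frac{du}{du} = 1$").

(b) (ii) $u = 4x^2 + 9$ implies $x = \left(\frac{u - 9}{4}\right)^{1/2} = \frac{1}{2}(u - 9)^{1/2}$ and $\frac{du}{dx} = 8x$. Now,

$$\begin{aligned}
\int \frac{x^3}{(4x^2 + 9)^{3/2}}\,dx &= \int \frac{1}{8} \cdot \frac{x^2}{(4x^2 + 9)^{3/2}}\,8x\,dx = \int \frac{u - 9}{4u^{3/2}} \cdot \frac{1}{8}\frac{du}{dx}\,dx \overset{1}{=} \int \frac{u - 9}{32u^{3/2}}\,du + C_1 \\
&= \frac{1}{32}\left(2u^{0.5} + 18u^{-0.5}\right) + C = \frac{1}{16}u^{-0.5}(u + 9) + C = \frac{4x^2 + 18}{16\sqrt{4x^2 + 9}} + C,
\end{aligned}$$

where $\overset{1}{=}$ uses Theorem 9 ("cancel out $dx$'s").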

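(b) (iii) Let $y$ = Opposite / Adjacent. Fix "Adjacent" $= 1$, so that "Opposite" $= y$ and "Hypotenuse" $= \sqrt{\text{Adjacent}^2 + \text{Opposite}^2} = \sqrt{1^2 + y^2} = \sqrt{1 + y^2}$. So

$$\cos(\tan^{-1} y) = \frac{\text{Adjacent}}{\text{Hypotenuse}} = \frac{1}{\sqrt{1 + y^2}}.$$

Applying the above result to our present context, we have

$$\cos\left(\tan^{-1}\left(\frac{2x}{3}\right)\right) = \frac{1}{\sqrt{1 + \left(\frac{2x}{3}\right)^2}}.$$

We now show that the answers in (i) and (ii) are exactly identical!

$$\begin{aligned}
\frac{3}{16}\left[\frac{1}{\cos\left(\tan^{-1}\left(\frac{2x}{3}\right)\right)} + \cos\left(\tan^{-1}\left(\frac{2x}{3}\right)\right)\right] &= \frac{3}{16}\left[\sqrt{1 + \left(\frac{2x}{3}\right)^2} + \frac{1}{\sqrt{1 + \left(\frac{2x}{3}\right)^2}}\right] \\
&= \frac{3}{16} \cdot \frac{1 + \left(\frac{2x}{3}\right)^2 + 1}{\sqrt{1 + \left(\frac{2x}{3}\right)^2}} = \frac{3}{16} \cdot \frac{2 + \left(\frac{2x}{3}\right)^2}{\sqrt{1 + \left(\frac{2x}{3}\right)^2}} = \frac{18 + 4x^2}{16\sqrt{9 + 4x^2}}.
\end{aligned}$$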

Answer to Exercise 187. By the DETAIL rule of thumb, we should in both cases choose $v' = \sin x$. So

$$\int \underbrace{x}_{u}\underbrace{\sin x}_{v'}\,dx = uv - \int u'v\,dx = -x\cos x + \int \cos x\,dx = -x\cos x + \sin x + C.$$

$$\begin{aligned}
\int \underbrace{x^2}_{u}\underbrace{\sin x}_{v'}\,dx &= uv - \int u'v\,dx = -x^2\cos x + \int 2\underbrace{x}_{u}\underbrace{\cos x}_{v'}\,dx \\
&= -x^2\cos x + 2\left(x\sin x - \int \sin x\,dx\right) \\
&= -x^2\cos x + 2(x\sin x + \cos x) + C = (2 - x^2)\cos x + 2x\sin x + C.
\end{aligned}$$

90.5 Answers for Ch. 49: The Fundamental Theorems of Calculus

Answer to Exercise 188. For the Lower Sum $S_{L12}$, each rectangle has width (or base) 0.5. The first rectangle has height $f(0)$, the second $f(0.5)$, the third $f(1)$, ..., the twelfth $f(5.5)$. And so

$$S_{L12} = \frac{1}{2}\left[f(0) + f\left(\frac{1}{2}\right) + \dots + f\left(\frac{11}{2}\right)\right] = \frac{1}{2}\left[\left(\sqrt{0} + 1\right) + \left(\sqrt{\frac{1}{2}} + 1\right) + \dots + \left(\sqrt{\frac{11}{2}} + 1\right)\right] \approx 15.116.$$

For the Upper Sum $S_{U12}$, each rectangle again has width (or base) 0.5. The first rectangle has height $f(0.5)$, the second $f(1)$, the third $f(1.5)$, ..., the twelfth $f(6)$. And so

$$S_{U12} = \frac{1}{2}\left[f\left(\frac{1}{2}\right) + f(1) + \dots + f(6)\right] = \frac{1}{2}\left[\left(\sqrt{\frac{1}{2}} + 1\right) + \left(\sqrt{\frac{2}{2}} + 1\right) + \dots + \left(\sqrt{6} + 1\right)\right] \approx 16.341.$$

Altogether then, lower and upper bounds for $A(6)$ are:

$$15.116 \approx S_{L12} \le A(6) = 15.79795897\ldots \le S_{U12} \approx 16.341.$$
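The two sums in Exercise 188 can be reproduced in a few lines. This is a sketch only, assuming (as the working above indicates) that $f(x) = \sqrt{x} + 1$ on $[0, 6]$ with 12 strips; the function name `riemann_bounds` is invented for this example and relies on $f$ being increasing, so that left endpoints give the lower sum and right endpoints the upper sum.

```python
import math

def f(x):
    """The integrand, f(x) = sqrt(x) + 1."""
    return math.sqrt(x) + 1

def riemann_bounds(f, a, b, n):
    """Lower and upper sums for an increasing f on [a, b] using n equal strips."""
    width = (b - a) / n
    lower = width * sum(f(a + i * width) for i in range(n))        # left endpoints
    upper = width * sum(f(a + (i + 1) * width) for i in range(n))  # right endpoints
    return lower, upper

lower, upper = riemann_bounds(f, 0, 6, 12)
exact = (2 / 3) * 6 ** 1.5 + 6  # integral of sqrt(x) + 1 from 0 to 6
print(lower, exact, upper)
```

Increasing the number of strips (for example to 1200) squeezes both bounds towards the exact area.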

Answer to Exercise 189. We do not know which the area function is, amongst the infinitely-many indefinite integrals of f . We merely know that the area function is one of them. Hence, we use the indefinite article an, rather than the definite article the.

90.6 Answers for Ch. 50: Definite Integrals

Answer to Exercise 190. Our desired area is labelled $A$ below.

Method #1. The entire rectangle $A + B + C + D$ has area $2^{1/3} \times 2 = 2^{4/3}$. The rectangle $B + C$ has area $1 \times 1 = 1$. The region $D$ has area $\int_1^{2^{1/3}} x^3\,dx = \left[\frac{x^4}{4}\right]_1^{2^{1/3}} = \frac{2^{4/3} - 1}{4}$. Hence,

$$A = A + B + C + D - (B + C + D) = 2^{4/3} - \left(1 + \frac{2^{4/3} - 1}{4}\right) = \frac{3}{4}\left(2^{4/3} - 1\right).$$

Method #2. $y = x^3 \iff x = y^{1/3}$. So

$$A = \int_{y=1}^{y=2} x\,dy = \int_1^2 y^{1/3}\,dy = \frac{3}{4}\left[y^{4/3}\right]_1^2 = \frac{3}{4}\left(2^{4/3} - 1\right).$$

[Figure: the curve $y = x^3$ with the horizontal lines $y = 1$ and $y = 2$, and the regions labelled $A$, $B$, $C$, and $D$.]

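Answer to Exercise 191. The curve $y = \sin x$ and the line $y = 0.5$ intersect at $x = \pi/6$ and $x = 5\pi/6$.

$$A = \int_{\pi/6}^{5\pi/6} \sin x - 0.5\,dx = [-\cos x - 0.5x]_{\pi/6}^{5\pi/6} = -\cos\frac{5\pi}{6} + \cos\frac{\pi}{6} - \frac{5\pi}{12} + \frac{\pi}{12} = -\left(-\frac{\sqrt{3}}{2}\right) + \frac{\sqrt{3}}{2} - \frac{\pi}{3} = \sqrt{3} - \frac{\pi}{3}.$$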

Answer to Exercise 192. By the quadratic formula, the two curves intersect at $x = \pm\sqrt{2}/2$. So

$$A = \int_{-\sqrt{2}/2}^{\sqrt{2}/2} 2 - x^2 - (x^2 + 1)\,dx = \left[x - \frac{2x^3}{3}\right]_{-\sqrt{2}/2}^{\sqrt{2}/2} = \left[\frac{\sqrt{2}}{2} - \frac{2\sqrt{2}}{12}\right] - \left[-\frac{\sqrt{2}}{2} + \frac{2\sqrt{2}}{12}\right] = \frac{2\sqrt{2}}{3}.$$

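Answer to Exercise 193. Compute

$$\int_{-2}^{2} x^4 - 16\,dx = \left[\frac{x^5}{5} - 16x\right]_{-2}^{2} = \left(\frac{32}{5} - 32\right) - \left(\frac{-32}{5} + 32\right) = -\frac{256}{5}.$$

Hence the desired area is $256/5$.

[Figure: the curve $y = x^4 - 16$, with the region $A$ between the curve and the $x$-axis.]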

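Answer to Exercise 194. (Again, it helps to graph this on your calculator.) Note that $y = 1 \iff t = \sqrt[3]{2}$; $y = 2 \iff t = \sqrt[3]{3}$; and $dy/dt = 3t^2$. So the area can be computed as:

$$\begin{aligned}
\int_{y=1}^{y=2} x\,dy &= \int_{t=\sqrt[3]{2}}^{t=\sqrt[3]{3}} (t^2 + 2t)\,3t^2\,dt = \left[\frac{3t^5}{5} + \frac{6t^4}{4}\right]_{\sqrt[3]{2}}^{\sqrt[3]{3}} \\
&= \left[\frac{3 \cdot 3^{5/3}}{5} + \frac{6 \cdot 3^{4/3}}{4}\right] - \left[\frac{3 \cdot 2^{5/3}}{5} + \frac{6 \cdot 2^{4/3}}{4}\right] = \frac{3^{8/3} - 3 \cdot 2^{5/3}}{5} + \frac{3^{7/3} - 3 \cdot 2^{4/3}}{2}.
\end{aligned}$$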

Answer to Exercise 195. By Fact 61,

$$\int_0^{\pi} \pi y^2\,dx = \int_0^{\pi} \pi\sin^2 x\,dx = \pi\left[\frac{1}{2}x - \frac{\sin 2x}{4}\right]_0^{\pi} = \frac{\pi^2}{2}.$$

90.7 Answers for Ch. 51: Differential Equations

Answer to Exercise 196. We'll need to use IBP twice:

$$y = \int \underbrace{e^x}_{v'}\underbrace{\sin x}_{u}\,dx = e^x\sin x - \int \underbrace{\cos x}_{u}\underbrace{e^x}_{v'}\,dx = e^x\sin x - \Big(e^x\cos x + \underbrace{\int \sin x\,e^x\,dx}_{y}\Big)$$

$$\iff y = \frac{e^x}{2}(\sin x - \cos x) + C \text{ is the general solution.}$$

Given also the initial condition $x = 0 \implies y = 1$, we find that $1 = \frac{e^0}{2}(\sin 0 - \cos 0) + C = C - 0.5 \iff C = 1.5$. Thus, the particular solution is $y = \frac{e^x}{2}(\sin x - \cos x) + 1.5$.

Answer to Exercise 197. Rearranging, $\frac{dx}{dy} = \frac{1}{y^2 + 1}$. So the general solution is $x = \int \frac{1}{y^2 + 1}\,dy = \tan^{-1} y + C$ (Proposition 10). Rearranging, the general solution is $y = \tan(x + D)$ (where $D = -C$).

Given also the initial condition $x = 0 \implies y = 1$, we have $C = -\pi/4$. So the particular solution is $x = \tan^{-1} y - \pi/4$.

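Answer to Exercise 198. We'll need to use IBP twice:

$$\frac{dy}{dx} = \int \underbrace{e^x}_{v'}\underbrace{\sin x}_{u}\,dx = e^x\sin x - \int \underbrace{\cos x}_{u}\underbrace{e^x}_{v'}\,dx = e^x\sin x - \Big(e^x\cos x + \underbrace{\int \sin x\,e^x\,dx}_{dy/dx}\Big)$$

$$\iff \frac{dy}{dx} \overset{1}{=} \frac{e^x}{2}(\sin x - \cos x) + C_1$$

$$\begin{aligned}
\implies y = \int \frac{dy}{dx}\,dx &= \int \frac{e^x}{2}(\sin x - \cos x) + C_1\,dx \\
&= \frac{1}{2}\left[\int e^x\sin x\,dx - \int e^x\cos x\,dx\right] + C_1 x \\
&\overset{2}{=} \frac{1}{2}\left[\frac{e^x}{2}(\sin x - \cos x) - \frac{e^x}{2}(\sin x + \cos x)\right] + C_1 x + C_2 \\
&= \frac{1}{2}\left(-e^x\cos x + 2C_1 x + 2C_2\right),
\end{aligned}$$

where $\overset{2}{=}$ used $\overset{1}{=}$ (and the analogous result $\int e^x\cos x\,dx = \frac{e^x}{2}(\sin x + \cos x) + C$). The general solution is $y = \frac{1}{2}(-e^x\cos x + 2C_1 x + 2C_2)$.

Given also the initial conditions $x = 0 \implies y = 1$ and $x = \pi/2 \implies y = 2$, we have

$$1 = \frac{1}{2}\left(-e^0\cos 0 + 2C_1 \cdot 0 + 2C_2\right) \implies C_2 = 1.5,$$

$$2 = \frac{1}{2}\left(-e^{\pi/2}\cos\frac{\pi}{2} + 2C_1 \cdot \frac{\pi}{2} + 3\right) \implies C_1 = \frac{1}{\pi}.$$

So the particular solution is $y = \frac{1}{2}\left(-e^x\cos x + \frac{2x}{\pi} + 3\right)$.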

Answer to Exercise 199. (a) The Law of Gravitation is $F = \frac{GMm}{r^2}$, where $G \in \mathbb{R}$ is some constant.

(b) (i) Newton's Second Law of Motion is that $F = \frac{d}{dt}(mv)$.

(ii) By the Product Rule, $F = \frac{d}{dt}(mv) = v\frac{dm}{dt} + m\frac{dv}{dt}$. Assuming that $m$ is constant, we have $\frac{dm}{dt} = 0$ and hence $F = m\frac{dv}{dt}$.

(c) Taking the Earth to be immobile, the force of gravitation is the rate of change of momentum of the small ball. That is,

$$F = \frac{GMm}{r^2} = -m\frac{dv}{dt}.$$

The ball drops towards the surface of the Earth at an increasing speed. By assumption, downwards is the negative direction. Hence the negative sign. Cancelling out the $m$'s yields

$$\frac{GM}{r^2} = -\frac{dv}{dt}.$$

(d) (i) $\int_{R+x}^{R} \frac{Gm_1}{r^2}\,dr = Gm_1\left[-\frac{1}{r}\right]_{R+x}^{R} = Gm_1\left(-\frac{1}{R} + \frac{1}{R + x}\right)$.

(ii) $-\int_{r=R+x}^{r=R} \frac{dv}{dt}\,dr = -\int_{r=R+x}^{r=R} \frac{dr}{dt}\,dv = -\int_{v=0}^{v=v_s} v\,dv = -\left[\frac{v^2}{2}\right]_0^{v_s} = -\frac{v_s^2}{2}$.

(iii) $Gm_1\left(-\frac{1}{R} + \frac{1}{R + x}\right) = -\frac{v_s^2}{2} \iff v_s = \pm\sqrt{2Gm_1\left(\frac{1}{R} - \frac{1}{R + x}\right)}$.

By assumption, downwards is the negative direction. So for (d) (iii), we must have $v_s = -\sqrt{2Gm_1\left(\frac{1}{R} - \frac{1}{R + x}\right)}$.

(e) This is simply the same process as before, but in reverse. The ball will keep moving upwards, but the force of gravitation will keep pulling it down, reducing its velocity at a rate given by the equation $\frac{Gm_1}{r^2} = -\frac{dv}{dt}$. Eventually, the velocity of the ball will hit 0 and then start going negative (i.e. the ball will start falling down towards the Earth).

Hence, if $x$ is the maximum height attained by the ball, we have $V = \sqrt{2GM\left(\frac{1}{R} - \frac{1}{R + x}\right)}$.

(f) In order for the ball to never fall back down to earth, it must be that the ball keeps going upwards and never reaches any maximum height. That is, $x \to \infty$. Thus,

$$v_e = \lim_{x \to \infty} V = \lim_{x \to \infty} \sqrt{2GM\left(\frac{1}{R} - \frac{1}{R + x}\right)} = \sqrt{\frac{2GM}{R}}.$$

(g) The escape velocity is

$$v_e = \sqrt{\frac{2GM}{R}} = \sqrt{\frac{2 \cdot 6.674 \times 10^{-11} \cdot 5.972 \times 10^{24}}{6371000}} \approx 11{,}190 \text{ m s}^{-1}.$$

Hence, the escape velocity is approximately 11.19 km s$^{-1}$.
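The arithmetic in part (g), and the limit in part (f), can be checked with a short script. This is a sketch using the same constants as the answer ($G = 6.674 \times 10^{-11}$, $M = 5.972 \times 10^{24}$, $R = 6371000$, in SI units); the function names are assumptions made for this illustration.

```python
import math

G = 6.674e-11   # gravitational constant (SI units), as used in the answer
M = 5.972e24    # mass of the Earth in kg, as used in the answer
R = 6_371_000   # radius of the Earth in m, as used in the answer

def launch_speed(x):
    """Speed V needed at the surface to reach maximum height x, from part (e)."""
    return math.sqrt(2 * G * M * (1 / R - 1 / (R + x)))

def escape_velocity():
    """Limit of V as x tends to infinity, from part (f)."""
    return math.sqrt(2 * G * M / R)

v_e = escape_velocity()
print(v_e)                 # in m/s; roughly 11.2 km/s
print(launch_speed(1e12))  # already very close to v_e for a huge height
```

The required launch speed increases with the target height $x$ and approaches, but never exceeds, the escape velocity.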


Answer to Exercise 200. Given $\frac{d^2y}{dx^2} = x$, we have $\frac{dy}{dx} = \int x\,dx = \frac{x^2}{2} + C_1$ and $y = \int \frac{dy}{dx}\,dx = \frac{x^3}{6} + C_1 x + C_2$.

Given also the initial condition $x = 0 \implies y = 1$, we find that $C_2 = 1$. Hence, the general solution is $y = \frac{x^3}{6} + C_1 x + 1$.

Sketched below are five members of the family of solution curves, specifically where $C_1 = \pm 2, \pm 1, 0$.

[Figure: five solution curves $y = \frac{x^3}{6} + C_1 x + 1$, plotted against the $x$- and $y$-axes.]

91 Answers to Exercises in Part VI: Probability and Statistics

91.1 Answers for Ch. 52: How to Count: Four Principles

Answer to Exercise 201. Taking the green path, there are 3 ways. Taking the red path, there are 2 ways. Hence, there are 3 + 2 = 5 ways to get from the Starting Point to the River.

Answer to Exercise 202. The tree diagram below illustrates.

Case #1. First letter is a D.
Case #1(i). Second letter is a D. Then the last two letters must both be E's. (1 permutation.)
Case #1(ii). Second letter is an E. Then the last two letters must be either DE or ED. (2 permutations.)

Case #2. First letter is an E.
Case #2(i). Second letter is an E. Then the last two letters must both be D's. (1 permutation.)
Case #2(ii). Second letter is a D. Then the last two letters must be either DE or ED. (2 permutations.)

Altogether then, there are 1 + 2 + 1 + 2 = 6 possible permutations of the letters in DEED.


Answer to Exercise 203. 3 × 5 × 10 = 150.

Answer to Exercise 204. We must choose three 4D numbers. Choosing the first 4D number involves four decisions: what to put as the first, second, third and fourth digits, with the condition that no digit is repeated. Thus, by the MP, there are 10 × 9 × 8 × 7 = 5040 ways to choose the first 4D number.

If we ignored the fact that we already chose the first 4D number, then there’d similarly be 5040 ways to choose the second 4D number (given the condition that this second 4D number does not have any repeated digits). However, there is an additional condition — namely, the second 4D number cannot be the same as the first. Thus, there are 5040 − 1 = 5039 ways to choose the second 4D number. By similar reasoning, we see that there are 5040 − 2 = 5038 ways to choose the third 4D number.

Altogether then, by the MP, there are 5040 × 5039 × 5038 = 127, 947, 869, 280 ways to choose the three 4D numbers.
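The counts in Exercise 204 can be confirmed both by formula and by brute force. The following is a small sketch; the variable names are assumptions made for this example. It counts the 4-digit strings with no repeated digit directly, then applies the Multiplication Principle to choose three different ones in order.

```python
from itertools import permutations
from math import perm

# Ordered selections of 4 distinct digits out of 10: 10 x 9 x 8 x 7.
first = perm(10, 4)

# Brute-force confirmation: enumerate every such 4-digit string.
brute_force = sum(1 for _ in permutations("0123456789", 4))

# Second and third numbers must differ from those already chosen.
total = first * (first - 1) * (first - 2)
print(first, brute_force, total)
```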


Answer to Exercise 205. Apply the IEP twice. 1. The food court and hawker centre share 2 types of cuisine (Chinese and Western) in common. And so together, the food court and the hawker centre have 4 + 3 − 2 = 5 different types of cuisine.

2. Combine together the food court and the hawker centre (call this the “Low-Class Place”). The Low-Class Place has 5 types of cuisine and shares 2 types of cuisine (Chinese and Malay) with the restaurant. And so together, the Low-Class Place and restaurant have 5 + 3 − 2 = 6 different types of cuisine (namely Chinese, Indonesian, Japanese, Korean, Malay, and Western).

Answer to Exercise 206. 10 − 3 = 7. (Can you name them?)


91.2 Answers for Ch. 53: How to Count: Permutations

Answer to Exercise 207. 6! = 720, 7! = 5040, and 8! = 40320.

Answer to Exercise 208. 7!/(4!3!) = 35.

Answer to Exercise 209. 9!.

Answer to Exercise 210. The problem of choosing a president and vice-president from a committee of 11 members is equivalent to the problem of filling 2 spaces with 11 distinct objects. The answer is thus P(11, 2) = 11!/9! = 11 × 10 = 110.

Answer to Exercise 211. Let B and S stand for brother and sister, respectively.

(a) First consider the problem of permuting the seven letters in BBBBSSS, without any two B's next to each other. There is only 1 possible arrangement, namely BSBSBSB. There are 4! ways to permute the brothers and 3! ways to permute the sisters. Hence, there are in total 1 × 4!3! = 144 possible ways to arrange the siblings in a line, so that no two brothers are next to each other.

(b) First consider the problem of permuting the seven letters in BBBBSSS, without any two S's next to each other. We'll use the AP.

1. B in position #1.
(a) B in position #2. Then the only way to fill the remaining five positions is SBSBS. Total: 1 possible arrangement.
(b) S in position #2. Then we must have B in position #3.
i. B in position #4. Then the only way to fill the remaining three positions is SBS. Total: 1 possible arrangement.
ii. S in position #4. Then we must have B in position #5. And there are two ways to fill the remaining two positions: either BS or SB. Total: 2 possible arrangements.

2. S in position #1. Then we must have B in position #2.
(a) B in position #3. Then, like in 1(b), we are left with two B's and two S's to fill the remaining four positions. Hence, Total: 3 possible arrangements.
(b) S in position #3. Then we must have B in position #4. There are three ways to fill the remaining three positions: SBB, BSB, and BBS. Total: 3 possible arrangements.

By the AP, there are 1 + 1 + 2 + 3 + 3 = 10 possible arrangements.

Again, there are 4! ways to permute the brothers and 3! ways to permute the sisters. Hence, there are in total 10 × 4!3! = 1440 possible ways to arrange the siblings in a line, so that no two sisters are next to each other. (c) We saw that there was only 1 possible (linear) permutation of BBBBSSS that satisfied the restriction, namely BSBSBSB. If we now arrange the siblings in a circle, there will necessarily be two brothers next to each other. We thus conclude: There are 0 possible ways to arrange the siblings in a circle so that no two brothers are next to each other.

(d) In part (b), we found 10 possible (linear) permutations of BBBBSSS that satisfied the restriction. Of these, 3 have sisters at the two ends: SBSBBBS, SBBSBBS, and SBBBSBS. If arranged in a circle, these 3 arrangements would involve two sisters next to each other. So we must deduct these 3 arrangements. We are left with 7 possible arrangements: BBSBSBS, SBBSBSB, BSBBSBS, SBSBBSB, BSBSBBS, SBSBSBB, and BSBSBSB. But of course, these are simply one and the same fixed circular permutation! (This is consistent with Fact 64, which tells us to simply divide by 7.) And now again, we must now take into account the fact that the brothers are distinct and the sisters are distinct. We conclude that there are in total 1 × 4!3! = 144 possible ways to arrange the siblings in a circle, so that no two sisters are next to each other.
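The four counts in Exercise 211 are small enough to check by listing every seating. The sketch below is only a brute-force check, not the method used above. The labels B1 to B4 and S1 to S3 are hypothetical names for the siblings, and circular seatings are counted up to rotation by fixing B1 in the first seat.

```python
from itertools import permutations

# Hypothetical labels: four brothers B1..B4 and three sisters S1..S3.
people = ["B1", "B2", "B3", "B4", "S1", "S2", "S3"]

def has_adjacent(seating, kind, circular):
    """True if two people whose label starts with `kind` sit next to each other."""
    n = len(seating)
    last = n if circular else n - 1  # a circle also joins the last seat to the first
    return any(
        seating[i][0] == kind and seating[(i + 1) % n][0] == kind
        for i in range(last)
    )

def count_line(kind):
    """Linear seatings with no two `kind` siblings adjacent."""
    return sum(not has_adjacent(p, kind, False) for p in permutations(people))

def count_circle(kind):
    """Circular seatings (up to rotation) with no two `kind` siblings adjacent."""
    first, rest = people[0], people[1:]  # fixing B1 removes the 7 rotations
    return sum(
        not has_adjacent((first,) + p, kind, True) for p in permutations(rest)
    )

answers = {
    "a": count_line("B"),
    "b": count_line("S"),
    "c": count_circle("B"),
    "d": count_circle("S"),
}
print(answers)
```

Fixing one person's seat is one common way to avoid counting rotations of the same circle more than once.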

91.3 Answers for Ch. 54: How to Count: Combinations

Answer to Exercise 212.

$$\binom{n}{k} = \frac{n!}{k!(n-k)!} = \frac{n \times (n-1) \times \cdots \times (n-k+1) \times (n-k) \times (n-k-1) \times \cdots \times 1}{k!\,(n-k) \times (n-k-1) \times \cdots \times 1} = \frac{n \times (n-1) \times (n-2) \times \cdots \times (n-k+1)}{k!} \quad \text{(mass cancellation)}.$$

Answer to Exercise 213.

$$C(4,2) = \frac{4!}{2!(4-2)!} = \frac{4!}{2!2!} = \frac{4 \times 3}{2 \times 1} = 6,$$

$$C(6,4) = \frac{6!}{4!(6-4)!} = \frac{6!}{4!2!} = \frac{6 \times 5}{2 \times 1} = 15,$$

$$C(7,3) = \frac{7!}{3!(7-3)!} = \frac{7!}{3!4!} = \frac{7 \times 6 \times 5}{3 \times 2 \times 1} = 35.$$

Answer to Exercise 214. $\binom{3}{1}\binom{7}{2}\binom{5}{2} = 3 \times 21 \times 10 = 630$.

Answer to Exercise 215. (a) C(1, 0) + C(1, 1) = 1 + 1 = 2 = C(2, 1).

(b) C(4, 2) + C(4, 3) = 6 + 4 = 10 = C(5, 3).

(c) $$C(17,2) + C(17,3) = \frac{17!}{2!15!} + \frac{17!}{3!14!} = \frac{17 \times 16}{2 \times 1} + \frac{17 \times 16 \times 15}{3 \times 2 \times 1} = 17 \times 8 + 17 \times 8 \times 5 = 17 \times 8 \times 6 = \frac{18 \times 17 \times 16}{3 \times 2 \times 1} = C(18,3).$$

Answer to Exercise 216.

$$\binom{7}{0} = 1,\ \binom{7}{1} = 7,\ \binom{7}{2} = 21,\ \binom{7}{3} = 35,\ \binom{7}{4} = 35,\ \binom{7}{5} = 21,\ \binom{7}{6} = 7,\ \binom{7}{7} = 1.$$

Answer to Exercise 217. Expanding, we have

$$(1+x)^3 = (1+x)(1+x)(1+x) = \underbrace{1 \cdot 1 \cdot 1}_{0\ x\text{'s}} + \underbrace{1 \cdot 1 \cdot x + 1 \cdot x \cdot 1 + x \cdot 1 \cdot 1}_{1\ x} + \underbrace{1 \cdot x \cdot x + x \cdot 1 \cdot x + x \cdot x \cdot 1}_{2\ x\text{'s}} + \underbrace{x \cdot x \cdot x}_{3\ x\text{'s}}.$$

Consider the 8 terms on the right.

There is C(3, 0) = 1 way to choose 0 of the x’s. Hence, the coefficient on $x^0$ is C(3, 0). This corresponds to the term 1 ⋅ 1 ⋅ 1 above.

There are C(3, 1) = 3 ways to choose 1 of the x’s. Hence, the coefficient on $x^1$ is C(3, 1). This corresponds to the terms 1 ⋅ 1 ⋅ x, 1 ⋅ x ⋅ 1, and x ⋅ 1 ⋅ 1 above.

There are C(3, 2) = 3 ways to choose 2 of the x’s. Hence, the coefficient on $x^2$ is C(3, 2). This corresponds to the terms 1 ⋅ x ⋅ x, x ⋅ 1 ⋅ x, and x ⋅ x ⋅ 1 above.

There is C(3, 3) = 1 way to choose 3 of the x’s. Hence, the coefficient on $x^3$ is C(3, 3). This corresponds to the term x ⋅ x ⋅ x above.

Altogether then,

$$(1+x)^3 = \binom{3}{0}x^0 + \binom{3}{1}x^1 + \binom{3}{2}x^2 + \binom{3}{3}x^3 = 1 + 3x + 3x^2 + x^3.$$

Answer to Exercise 218. $2^7 = 128$.

$$\binom{7}{0} + \binom{7}{1} + \binom{7}{2} + \cdots + \binom{7}{7} = 1 + 7 + 21 + 35 + 35 + 21 + 7 + 1 = 128.$$

So indeed, $2^7 = \binom{7}{0} + \binom{7}{1} + \binom{7}{2} + \cdots + \binom{7}{7}$.

Answer to Exercise 219.

$$(3+x)^4 = \binom{4}{0}3^4x^0 + \binom{4}{1}3^3x^1 + \binom{4}{2}3^2x^2 + \binom{4}{3}3^1x^3 + \binom{4}{4}3^0x^4 = 81 + 4 \cdot 27x + 6 \cdot 9x^2 + 4 \cdot 3x^3 + x^4 = 81 + 108x + 54x^2 + 12x^3 + x^4.$$
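The binomial coefficients used in Exercises 215 to 219 can be reproduced with Python's `math.comb`. This is a minimal sketch for checking the arithmetic. The helper name `binomial_coefficients` is our own and is not from the textbook.

```python
from math import comb

def binomial_coefficients(a, n):
    """Coefficients of x^0, x^1, ..., x^n in the expansion of (a + x)^n."""
    return [comb(n, k) * a ** (n - k) for k in range(n + 1)]

row7 = [comb(7, k) for k in range(8)]
print(row7)                                        # C(7, 0), ..., C(7, 7)
print(sum(row7) == 2 ** 7)                         # Exercise 218
print(binomial_coefficients(1, 3))                 # (1 + x)^3, Exercise 217
print(binomial_coefficients(3, 4))                 # (3 + x)^4, Exercise 219
print(comb(17, 2) + comb(17, 3) == comb(18, 3))    # Exercise 215(c)
```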

Answer to Exercise 220. (a) There are $\binom{4}{2} = 6$ ways of choosing the two Tan sons and $\binom{3}{2} = 3$ ways of choosing the two Wong daughters.

Having chosen these sons and daughters, there are only 2! = 2 × 1 possible ways of matching them up. This is because for the first chosen Tan son, we have 2 possible choices of brides for him. And then for the second chosen Tan son, there is only 1 possible choice of bride left for him. Altogether then, there are

$$\binom{4}{2}\binom{3}{2} \cdot 2 = 6 \cdot 3 \cdot 2 = 36$$

ways of forming the two couples.

(b) There are $\binom{6}{5} = 6$ ways of choosing the five Lee sons and $\binom{9}{5} = 126$ ways of choosing the five Ho daughters.

Having chosen these sons and daughters, there are 5! = 5 × 4 × 3 × 2 × 1 possible ways of matching them up. This is because for the first chosen Lee son, we have 5 possible choices of brides for him. And then for the second chosen Lee son, there are 4 possible choices of brides left for him. Etc. Altogether then, there are

$$\binom{6}{5}\binom{9}{5} \cdot 5! = 6 \cdot 126 \cdot 5! = 90{,}720$$

ways of forming the five couples.
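As a rough check of the counting argument in Exercise 220, we can list every set of disjoint couples directly and compare with the product formula. The function names below are hypothetical. The listing is only run for part (a), where it is tiny.

```python
from itertools import combinations, product
from math import comb, factorial

def count_by_listing(num_sons, num_daughters, num_couples):
    """Count sets of couples in which no son and no daughter appears twice."""
    pairs = list(product(range(num_sons), range(num_daughters)))
    count = 0
    for chosen in combinations(pairs, num_couples):
        sons = {s for s, _ in chosen}
        daughters = {d for _, d in chosen}
        if len(sons) == num_couples and len(daughters) == num_couples:
            count += 1
    return count

def count_by_formula(num_sons, num_daughters, num_couples):
    """Choose the sons, choose the daughters, then match them up."""
    return (
        comb(num_sons, num_couples)
        * comb(num_daughters, num_couples)
        * factorial(num_couples)
    )

print(count_by_listing(4, 3, 2), count_by_formula(4, 3, 2))   # part (a)
print(count_by_formula(6, 9, 5))                              # part (b)
```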

91.4 Answers for Ch. 55: Probability: Introduction

Answer to Exercise 221(a). (i) The appropriate sample space is

S = {A♠, K♠, Q♠, . . . , 2♠, A♥, K♥, Q♥, . . . , 2♥, A♦, K♦, Q♦, . . . , 2♦, A♣, K♣, Q♣, . . . , 2♣}.

(a) (ii) Since there are 52 possible outcomes, there are $2^{52}$ possible events. Hence, the event space contains $2^{52}$ elements. It is too tedious to write this out explicitly.

(a) (iii) As always, P has domain Σ and codomain R. We have P({3♦}) = P({5♣}) = 1/52 and P({3♦, 5♣}) = 2/52. In general, given any event A ∈ Σ, we have

P(A) = ∣A∣/∣S∣ = ∣A∣/52.

In words, given any event A, its probability P(A) is simply the number of elements it contains, divided by 52. So for example, P({3♦, 5♣, A♠}) = 3/52, as we would expect.

(a) (iv) John might argue that since packs of poker cards usually come with Jokers, there is the possibility that we mistakenly included one or more Jokers in our deck of cards. He might thus argue that to cover this possibility, we should set our sample space to be

S = {A♠, K♠, . . . , 2♠, A♥, K♥, . . . , 2♥, A♦, K♦, . . . , 2♦, A♣, K♣, . . . , 2♣, Joker}.

The event space would be appropriately adjusted to contain $2^{53}$ elements.

The mapping rule of the probability function would be appropriately adjusted, based on John’s belief of the probability of selecting a Joker. For example, if he reckons that the probability of selecting a Joker is 1/10,000, then he might assign P({Joker}) = 1/10,000 and for any other card C, P({C}) = 9999/(10000 ⋅ 52). The probability of any other event A ∈ Σ is as given by the Additivity Axiom.

Answer to Exercise 221(b). (i) The appropriate sample space is S = {HH, HT, TH, TT}.

(b) (ii) Since there are 4 possible outcomes, there are $2^4 = 16$ possible events. Hence, the event space contains 16 elements. It is not too tedious to write these out explicitly:

Σ = {∅, {HH}, {HT}, {TH}, {TT}, {HH, HT}, {HH, TH}, {HH, TT}, {HT, TH}, {HT, TT}, {TH, TT}, {HH, HT, TH}, {HH, HT, TT}, {HH, TH, TT}, {HT, TH, TT}, S}.

(b) (iii) As always, P has domain Σ and codomain R. We have P({HH}) = P({HT}) = 1/4 and P({HH, HT, TH}) = 3/4. In general, given any event A ∈ Σ, we have

P(A) = ∣A∣/∣S∣ = ∣A∣/4.

In words, given any event A, its probability P(A) is simply the number of elements it contains, divided by 4. So for example, P({TH, TT}) = 2/4, as we would expect.

(b) (iv) John might, as before, argue that there is the possibility that a coin lands on its edge. He might thus argue that the sample space should be

S = {HH, HT, HX, TH, TT, TX, XH, XT, XX}.

The event space would be appropriately adjusted to contain $2^9 = 512$ elements.

The mapping rule of the probability function would be appropriately adjusted. For example, if John believes that any given coin flip has probability 1/6000 of landing on its edge (and that heads and tails are otherwise equally likely), then we might assign $P(\{XX\}) = 1/6000^2$, $P(\{HH\}) = (5999/12000)^2$, $P(\{XH\}) = (1/6000) \cdot (5999/12000)$, etc.
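For a sample space this small, the whole event space can be generated by machine. The sketch below builds the power set of {HH, HT, TH, TT} and then checks that one possible assignment for John's nine-outcome variant sums to 1. The assumption that heads and tails remain equally likely when the coin does not land on its edge is ours, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import chain, combinations, product

def power_set(outcomes):
    """All subsets of `outcomes`, each as a frozenset (this is the event space)."""
    items = list(outcomes)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]

# Two fair coin flips: four equally likely outcomes.
S = ["".join(flips) for flips in product("HT", repeat=2)]
sigma = power_set(S)

def prob(event):
    """P(A) = |A| / |S| for equally likely outcomes."""
    return Fraction(len(event), len(S))

print(len(sigma), prob({"TH", "TT"}))

# John's variant: X marks a flip landing on its edge (probability 1/6000).
# Assumption: heads and tails are otherwise equally likely.
p_flip = {
    "X": Fraction(1, 6000),
    "H": Fraction(5999, 12000),
    "T": Fraction(5999, 12000),
}
p_pair = {a + b: p_flip[a] * p_flip[b] for a, b in product("HTX", repeat=2)}
print(len(p_pair), sum(p_pair.values()))
```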


Answer to Exercise 221(c). (i) The appropriate sample space S contains 36 outcomes, one for each ordered pair of die faces. [Die-face images not reproduced.]

(c) (ii) Since there are 36 possible outcomes, there are $2^{36}$ possible events. Hence, the event space contains $2^{36}$ elements.

(c) (iii) As always, P has domain Σ and codomain R. An event containing exactly one outcome has probability 1/36 and an event containing exactly two outcomes has probability 2/36. In general, given any event A ∈ Σ, we have

P(A) = ∣A∣/∣S∣ = ∣A∣/36.

In words, given any event A, its probability P(A) is simply the number of elements it contains, divided by 36. So for example, an event containing exactly four outcomes has probability 4/36, as we would expect.

(c) (iv) John might argue that there is the possibility that a die lands on a vertex. He might thus argue that the sample space contains $7^2 = 49$ outcomes: each die shows either one of its six faces or V (vertex).

The event space would be appropriately adjusted to contain $2^{49}$ elements.

The mapping rule of the probability function would be appropriately adjusted. For example, if John believes that any given die roll has probability 1/1000000 of landing on a vertex (and that the six faces are otherwise equally likely), then we might assign $P(\{VV\}) = 1/1000000^2$, and $P(\{s\}) = (999999/6000000)^2$ for any outcome s in which neither die lands on a vertex, etc.

Answer to Exercise 222. (a) By definition, A and B are mutually exclusive if A ∩ B = ∅. Since P(∅) = 0, the result follows.

(b) The events $A$, $A^c \cap B$, and $A^c \cap B^c \cap C$ are mutually exclusive. Moreover, their union is A ∪ B ∪ C. Hence, by the Additivity Axiom (applied twice),

$$P(A \cup B \cup C) = P(A) + P(A^c \cap B) + P(A^c \cap B^c \cap C).$$

91.5 Answers for Ch. 56: Conditional Probability

Answer to Exercise 223. Let A be the event that we rolled at least one even number and B be the event that the sum of the two dice was 8. We have P(B) = 5/36 (see Exercise 229). And A ∩ B can occur if and only if the two dice were (2, 6), (4, 4), or (6, 2), writing each outcome as an ordered pair. Hence, P(A ∩ B) = 3/36.

Altogether then,

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{3/36}{5/36} = \frac{3}{5}.$$
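The conditional probability in Exercise 223 can be confirmed by listing the 36 outcomes. In this sketch each outcome is an ordered pair (first die, second die), which is our own notation.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))   # 36 equally likely ordered pairs

A = {s for s in outcomes if s[0] % 2 == 0 or s[1] % 2 == 0}   # at least one even number
B = {s for s in outcomes if sum(s) == 8}                      # the sum is 8

def prob(event):
    """P(E) = |E| / 36 for equally likely outcomes."""
    return Fraction(len(event), len(outcomes))

p_A_given_B = prob(A & B) / prob(B)
print(sorted(A & B), prob(B), p_A_given_B)
```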

Answer to Exercise 224. It may be true that

P(DNA match ∣ Blood stain is not John Brown’s) = 1/10,000,000.

It does not however follow, except by the CPF, that

P(Blood stain is not John Brown’s ∣ DNA match) = 1/10,000,000.

There is reason to believe that P(Blood stain is not John Brown’s) is much greater than P(DNA match) and thus that P(Blood stain is not John Brown’s ∣ DNA match) is much greater than P(DNA match ∣ Blood stain is not John Brown’s).

One important factor is that if the DNA database is large, then invariably we’d expect to find, purely by coincidence, a DNA match to the blood stain at the murder scene. As of May 2016, the US National DNA Index contains the DNA profiles of over 12.3 million individuals. And so, even if it were true that there is only probability 1/10,000,000 that two random individuals have a DNA match, we’d expect to find a match, simply by combing through the entire US National DNA Index! The error here is similar to the lottery example, where we conclude (erroneously) that a lottery winner must have cheated, simply because it was so unlikely that she won.

91.6 Answers for Ch. 57: Probability: Independence

Answer to Exercise 225. By Fact 70, A, B are independent events ⇐⇒ P(A∣B) = P(A). By the definition of conditional probability, P(A∣B) = P(A ∩ B)/P(B), so that P(A ∩ B)/P(B) = P(A). Rearranging, P(B) = P(A ∩ B)/P(A) = P(B∣A), as desired.

Answer to Exercise 226. First, note that P (H1 ) = P (T1 ) = P (H2 ) = 0.5.

(a) P (H1 ∩ H2 ) = 0.25 = 0.5 × 0.5 = P (H1 ) P (H2 ), so that indeed H1 and H2 are independent. (b) P (H2 ∩ T1 ) = 0.25 = 0.5×0.5 = P (H2 ) P (T1 ), so that indeed H2 and T1 are independent. (c) Observe that H1 ∩ T1 = ∅ (it is impossible that “the first coin flip is heads” AND also “the first coin flip is tails”).

Hence, P (H1 ∩ T1 ) = P (∅) = 0 ≠ 0.25 = 0.5 × 0.5 = P (H1 ) P (T1 ), so that indeed H1 and T1 are not independent.

Answer to Exercise 227. No, the journalist is incorrectly assuming that the probability of one family member making the NBA is independent of another family member making the NBA. But such an assumption is almost certainly false. The same excellent genes that made Rick Barry a great basketball player, probably also helped his three sons. Not to mention that having an NBA player as your father probably helps a lot too. The two events “family member #1 in NBA” and “family member #2 in NBA” are probably not independent. So we cannot simply multiply probabilities together.

Answer to Exercise 228. First, note that P (H1 ) = P (T2 ) = P(X) = 0.5.

(a) P (H1 ∩ T2 ) = 0.25 = 0.5 × 0.5 = P (H1 ) P (T2 ), so that indeed H1 , T2 are independent. P (H1 ∩ X) = 0.25 = 0.5 × 0.5 = P (H1 ) P (X), so that indeed H1 , X are independent. P (T2 ∩ X) = 0.25 = 0.5 × 0.5 = P (T2 ) P (X), so that indeed T2 , X are independent. Altogether then, H1 , T2 , and X are indeed pairwise independent.

(b) The event H1 ∩ T2 ∩ X is the same as the event H1 ∩ T2 . Thus, P (H1 ∩ T2 ∩ X) = P (H1 ∩ T2 ) = 0.25 ≠ 0.5 × 0.5 × 0.5 = P (H1 ) P (T2 ) P(X), so that indeed the three events are not independent.

91.7 Answers for Ch. 59: Random Variables: Introduction

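Answer to Exercise 229. (a) Each outcome s is written below as an ordered pair (first die, second die).

k = 2: s = (1,1). P(X = 2) = 1/36.
k = 3: s = (1,2), (2,1). P(X = 3) = 2/36.
k = 4: s = (1,3), (2,2), (3,1). P(X = 4) = 3/36.
k = 5: s = (1,4), (2,3), (3,2), (4,1). P(X = 5) = 4/36.
k = 6: s = (1,5), (2,4), (3,3), (4,2), (5,1). P(X = 6) = 5/36.
k = 7: s = (1,6), (2,5), (3,4), (4,3), (5,2), (6,1). P(X = 7) = 6/36.
k = 8: s = (2,6), (3,5), (4,4), (5,3), (6,2). P(X = 8) = 5/36.
k = 9: s = (3,6), (4,5), (5,4), (6,3). P(X = 9) = 4/36.
k = 10: s = (4,6), (5,5), (6,4). P(X = 10) = 3/36.
k = 11: s = (5,6), (6,5). P(X = 11) = 2/36.
k = 12: s = (6,6). P(X = 12) = 1/36.

(b) E is the event X ≥ 10.

(c) $$P(E) = P(X \geq 10) = P(X = 10) + P(X = 11) + P(X = 12) = \frac{3}{36} + \frac{2}{36} + \frac{1}{36} = \frac{6}{36} = \frac{1}{6}.$$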
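The table for the sum of two dice can be regenerated in a few lines. This is a small sketch in which each outcome is an ordered pair of faces, and `counts` and `pmf` are our own names.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))
counts = Counter(a + b for a, b in outcomes)   # number of outcomes giving each sum
pmf = {k: Fraction(counts[k], len(outcomes)) for k in sorted(counts)}

for k in pmf:
    print(k, f"{counts[k]}/36")

p_E = sum(p for k, p in pmf.items() if k >= 10)   # the event X >= 10
print(p_E)
```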


Answer to Exercise 230. Write s₁ and s₂ for the two outcomes given in the exercise. [Die-face images not reproduced.] Then

P(s₁) = 5 and P(s₂) = 4.

Q(s₁) = 3 and Q(s₂) = 3.

(PQ)(s₁) = 15 and (PQ)(s₂) = 12.

Answer to Exercise 231. If S = {1, 2, 3, 4, 5, 6}, then X ∶ S → R defined by X(s) = s is of course a random variable. A random variable is simply any function with domain S and codomain R; and X certainly meets these requirements.

P(X = 1) = P(X = 2) = P(X = 3) = P(X = 4) = P(X = 5) = P(X = 6) = 1/6 and P(X = k) = 0 for any k ≠ 1, 2, 3, 4, 5, 6.

Answer to Exercise 232(a). (i) The sample space is

S = {HHHH, HHHT, HHTH, HTHH, THHH, HHTT, HTHT, THHT, HTTH, THTH, TTHH, HTTT, THTT, TTHT, TTTH, TTTT}.

The event space Σ is the set of all possible subsets of S and contains $2^{16}$ elements. The probability function P ∶ Σ → R is defined by P(A) = ∣A∣/16, for any event A ∈ Σ.

(a) (ii) The random variable X ∶ S → R is the function defined by

TTTT ↦ 0,
HTTT, THTT, TTHT, TTTH ↦ 1,
HHTT, HTHT, THHT, HTTH, THTH, TTHH ↦ 2,
HHHT, HHTH, HTHH, THHH ↦ 3,
HHHH ↦ 4.

(a) (iii) P(X = 4) = 1/16, P(X = 3) = 4/16, P(X = 2) = 6/16, P(X = 1) = 4/16, P(X = 0) = 1/16, and P(X = k) = 0, for any k ≠ 0, 1, 2, 3, 4.


Answer to Exercise 232(b). (i) The sample space S consists of $6^3 = 216$ outcomes, one for each ordered triple of die faces. [Die-face images not reproduced.]

The event space Σ is the set of all possible subsets of S and contains $2^{216}$ elements. The probability function P ∶ Σ → R is defined by P(A) = ∣A∣/216, for any event A ∈ Σ.

(b) (ii) The range of X is {3, 4, 5, . . . , 18}. We now count the number of ways there are for the three dice to reach a sum of 3, to reach a sum of 4, etc. This will enable us to write down the mapping rule of the function X ∶ S → R. Below, each roll is written as a triple of faces.

To get a sum of 3, the three dice must be (1,1,1). There is thus 3!/3! = 1 possibility.

To get a sum of 4, the three dice must be (1,1,2), or permutations thereof. There are thus 3!/2! = 3 possibilities.

To get a sum of 5, the three dice must be (1,1,3), (1,2,2), or permutations thereof. There are thus 3!/2! + 3!/2! = 6 possibilities.

To get a sum of 6, the three dice must be (1,1,4), (1,2,3), (2,2,2), or permutations thereof. There are 3!/2! + 3! + 3!/3! = 10 such possibilities.

To get a sum of 7, the three dice must be (1,1,5), (1,2,4), (1,3,3), (2,2,3), or permutations thereof. There are 3!/2! + 3! + 3!/2! + 3!/2! = 15 such possibilities.

To get a sum of 8, the three dice must be (1,1,6), (1,2,5), (1,3,4), (2,2,4), (2,3,3), or permutations thereof. There are 3!/2! + 3! + 3! + 3!/2! + 3!/2! = 21 such possibilities.

To get a sum of 9, the three dice must be (1,2,6), (1,3,5), (1,4,4), (2,2,5), (2,3,4), (3,3,3), or permutations thereof. There are 3! + 3! + 3!/2! + 3!/2! + 3! + 3!/3! = 25 such possibilities.

To get a sum of 10, the three dice must be (1,3,6), (1,4,5), (2,2,6), (2,3,5), (2,4,4), (3,3,4), or permutations thereof. There are 3! + 3! + 3!/2! + 3! + 3!/2! + 3!/2! = 27 such possibilities.

By symmetry, there are also 27 ways to get a sum of 11; also 25 ways to get a sum of 12, etc.

So X ∶ S → R is defined by

X(1,1,1) = 3,
X(1,1,2) = X(1,2,1) = X(2,1,1) = 4,
X(1,1,3) = X(1,3,1) = X(3,1,1) = X(1,2,2) = X(2,1,2) = X(2,2,1) = 5,
⋮

(b) (iii)

P(X = 3) = 1/216, P(X = 4) = 3/216, P(X = 5) = 6/216,
P(X = 6) = 10/216, P(X = 7) = 15/216, P(X = 8) = 21/216,
P(X = 9) = 25/216, P(X = 10) = 27/216, P(X = 11) = 27/216,
P(X = 12) = 25/216, P(X = 13) = 21/216, P(X = 14) = 15/216,
P(X = 15) = 10/216, P(X = 16) = 6/216, P(X = 17) = 3/216,
P(X = 18) = 1/216,
P(X = k) = 0, for any k ∉ {3, 4, 5, . . . , 18}.
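The counting method of part (b)(ii) (list the unordered triples, then count the distinct orderings of each) translates directly into code. The sketch below is one way to do it. The helper `orderings` is a hypothetical name for the 3!/2!-style count.

```python
from collections import Counter
from itertools import combinations_with_replacement
from math import factorial

def orderings(faces):
    """Distinct permutations of a multiset of faces, e.g. 3!/2! = 3 for (1, 1, 2)."""
    result = factorial(len(faces))
    for repeats in Counter(faces).values():
        result //= factorial(repeats)
    return result

ways = Counter()
for faces in combinations_with_replacement(range(1, 7), 3):   # unordered triples
    ways[sum(faces)] += orderings(faces)

for total in range(3, 19):
    print(total, f"{ways[total]}/216")
```

Working with unordered triples keeps the loop at 56 cases instead of 216, and the symmetry of the counts about 10.5 is visible in the printout.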

91.8 Answers for Ch. 60: Random Variables: Independence

Answer to Exercise 233. No. For example, P (X = 0, Y = 0) = 0, but P (X = 0) P (Y = 0) = 0.5 × 0.25 = 0.125.

91.9 Answers for Ch. 61: Random Variables: Expectation

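Answer to Exercise 234. (a) P(X + Y = 2) is simply the probability of 2 heads and 0 sixes OR 1 head and 1 six OR 0 heads and 2 sixes. So

$$P(X + Y = 2) = \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{5}{6} \cdot \frac{5}{6} + \binom{2}{1}\frac{1}{2} \cdot \frac{1}{2} \cdot \binom{2}{1}\frac{5}{6} \cdot \frac{1}{6} + \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{6} \cdot \frac{1}{6} = \frac{25}{144} + \frac{20}{144} + \frac{1}{144} = \frac{46}{144}.$$

(b) P(X + Y = 3) is simply the probability of 2 heads and 1 six OR 1 head and 2 sixes. So

$$P(X + Y = 3) = \frac{1}{2} \cdot \frac{1}{2} \cdot \binom{2}{1}\frac{5}{6} \cdot \frac{1}{6} + \binom{2}{1}\frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{6} \cdot \frac{1}{6} = \frac{10}{144} + \frac{2}{144} = \frac{12}{144}.$$

(c) P(X + Y = 4) is simply the probability of 2 heads and 2 sixes. So

$$P(X + Y = 4) = \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{6} \cdot \frac{1}{6} = \frac{1}{144}.$$

(d) $$E[X + Y] = \sum_{k \in \text{Range}(X+Y)} P(X + Y = k) \cdot k$$

$$= P(X + Y = 0) \cdot 0 + P(X + Y = 1) \cdot 1 + P(X + Y = 2) \cdot 2 + P(X + Y = 3) \cdot 3 + P(X + Y = 4) \cdot 4$$

$$= \frac{25}{144} \cdot 0 + \frac{60}{144} \cdot 1 + \frac{46}{144} \cdot 2 + \frac{12}{144} \cdot 3 + \frac{1}{144} \cdot 4 = \frac{60 + 92 + 36 + 4}{144} = \frac{192}{144} = \frac{4}{3}.$$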
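The distribution of X + Y can also be built mechanically. The sketch below assumes, as the answer above suggests, that X is the number of heads in two fair coin flips, that Y is the number of sixes in two fair die rolls, and that the coins and dice are independent. The function name `binomial_pmf` is our own.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """P(K = k) for k = 0, ..., n, where K counts successes in n independent trials."""
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

heads = binomial_pmf(2, Fraction(1, 2))   # X: heads in two coin flips (assumed)
sixes = binomial_pmf(2, Fraction(1, 6))   # Y: sixes in two die rolls (assumed)

# Distribution of X + Y, using independence of the coins and the dice.
total = {}
for x, px in heads.items():
    for y, py in sixes.items():
        total[x + y] = total.get(x + y, 0) + px * py

expectation = sum(k * p for k, p in total.items())
print({k: str(p) for k, p in total.items()}, expectation)
```

The fractions are printed in lowest terms, so 46/144 appears as 23/72.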


Answer to Exercise 235(a). The range of X consists simply of the possible prizes from the “big” game. Range(X) = {2000, 1000, 490, 250, 60, 0}. (Don’t forget to include 0.) Similarly, Range(Y) = {3000, 2000, 800, 0}.

(b) P(X = 2000) = P(X = 1000) = P(X = 490) = 1/10000, P(X = 250) = P(X = 60) = 10/10000, P(X = 0) = 9977/10000.

P(Y = 3000) = P(Y = 2000) = P(Y = 800) = 1/10000, P(Y = 0) = 9997/10000.

(c) $$E[X] = \sum_{k \in \text{Range}(X)} P(X = k) \cdot k = 2000P(X = 2000) + 1000P(X = 1000) + 490P(X = 490) + 250P(X = 250) + 60P(X = 60) + 0P(X = 0)$$

$$= \frac{2000}{10000} + \frac{1000}{10000} + \frac{490}{10000} + \frac{250 \cdot 10}{10000} + \frac{60 \cdot 10}{10000} + \frac{9977 \cdot 0}{10000} = 0.659.$$

$$E[Y] = \sum_{k \in \text{Range}(Y)} P(Y = k) \cdot k = P(Y = 3000) \cdot 3000 + P(Y = 2000) \cdot 2000 + P(Y = 800) \cdot 800 + P(Y = 0) \cdot 0$$

$$= \frac{1}{10000} \cdot 3000 + \frac{1}{10000} \cdot 2000 + \frac{1}{10000} \cdot 800 + \frac{9997}{10000} \cdot 0 = 0.3 + 0.2 + 0.08 + 0 = 0.58.$$

(d) For every $1 staked, the “big” game is expected to lose you $0.341 and the “small” game is expected to lose you $0.42. Thus, the “big” game is expected to lose you less money.
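The two expectations are short weighted sums, so they are easy to recompute. In this sketch each game is described by a dictionary from prize (per $1 staked) to the number of winning numbers out of 10,000, taken from part (b). The names `big`, `small` and `expectation` are our own.

```python
from fractions import Fraction

def expectation(prizes):
    """Expected prize; `prizes` maps each prize to its number of winning numbers out of 10,000."""
    return sum(Fraction(prize * winners, 10_000) for prize, winners in prizes.items())

big = {2000: 1, 1000: 1, 490: 1, 250: 10, 60: 10}
small = {3000: 1, 2000: 1, 800: 1}

for name, prizes in (("big", big), ("small", small)):
    e = expectation(prizes)
    print(name, float(e), float(1 - e))   # expected prize and expected loss per $1
```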

91.10 Answers for Ch. 62: Random Variables: Variance

Answer to Exercise 236. In Exercise 232(b), we already found that P(Z = 3) = 1/216, P(Z = 4) = 3/216, ..., P(Z = 18) = 1/216. By symmetry, we have µ = E[Z] = 10.5. So:

$$E[Z^2] = \frac{1}{216} \cdot 3^2 + \frac{3}{216} \cdot 4^2 + \cdots + \frac{1}{216} \cdot 18^2 = \frac{25704}{216} = 119.$$

Hence,

$$V[Z] = E[Z^2] - \mu^2 = 119 - 10.5^2 = \frac{105}{12}.$$

Answer to Exercise 237.

$$E[Y] = \frac{35}{100} \times 20 \text{ cm} + \frac{65}{100} \times 30 \text{ cm} = 26.5 \text{ cm}.$$

$$V[Y] = \frac{35}{100} \times (20 \text{ cm} - 26.5 \text{ cm})^2 + \frac{65}{100} \times (30 \text{ cm} - 26.5 \text{ cm})^2 = 22.75 \text{ cm}^2.$$

$$SD[Y] = \sqrt{V[Y]} \approx 4.77 \text{ cm}.$$
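Both variances can be recomputed from the probability mass functions using V[X] = E[X²] − (E[X])². The sketch below uses exact fractions. `mean_and_variance` is a hypothetical helper, and Z is taken to be the sum of three fair dice as in Exercise 232(b).

```python
from fractions import Fraction
from itertools import product
from math import sqrt

def mean_and_variance(pmf):
    """`pmf` maps each value to its probability; returns (E[X], V[X])."""
    mean = sum(p * x for x, p in pmf.items())
    second_moment = sum(p * x * x for x, p in pmf.items())
    return mean, second_moment - mean**2

# Z: the sum of three fair dice.
z_pmf = {}
for roll in product(range(1, 7), repeat=3):
    z_pmf[sum(roll)] = z_pmf.get(sum(roll), 0) + Fraction(1, 216)

# Y: 20 cm with probability 35/100 and 30 cm with probability 65/100.
y_pmf = {20: Fraction(35, 100), 30: Fraction(65, 100)}

print(mean_and_variance(z_pmf))
mean_y, var_y = mean_and_variance(y_pmf)
print(float(mean_y), float(var_y), round(sqrt(var_y), 2))
```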

Answer to Exercise 238. (a) 2µ kg, 2σ² kg².

(b) 2µ kg, 4σ² kg².

(c) The mean of the total weight of the two fish is 2µ kg. However, we do not know the variance, since the weights of the two fish are not independent.

91.11 Answers for Ch. 63: The Coin-Flips Problem

This chapter had no exercises.

91.12 Answers for Ch. 64: Bernoulli Trial and Distribution

This chapter had no exercises.

91.13 Answers for Ch. 65: Binomial Distribution

Answer to Exercise 239. Let X ∼ B(20, 0.01) be the number of components in engine #1 that fail. Let Y ∼ B(35, 0.005) be the number of components in engine #2 that fail. The probability that engine #1 fails is

$$P(X \geq 2) = 1 - P(X \leq 1) = 1 - P(X = 0) - P(X = 1) = 1 - \binom{20}{0}0.01^0\,0.99^{20} - \binom{20}{1}0.01^1\,0.99^{19} \approx 0.0169.$$

The probability that engine #2 fails is

$$P(Y \geq 2) = 1 - P(Y \leq 1) = 1 - P(Y = 0) - P(Y = 1) = 1 - \binom{35}{0}0.005^0\,0.995^{35} - \binom{35}{1}0.005^1\,0.995^{34} \approx 0.0133.$$

Hence, the probability that both engines fail is

P(X ≥ 2) P(Y ≥ 2) ≈ 0.00022.
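These binomial probabilities can be checked without a graphing calculator. The following is a minimal sketch in which `binomial_cdf` is our own helper. Multiplying the two failure probabilities assumes, as the answer does, that the engines fail independently.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

p_engine_1 = 1 - binomial_cdf(1, 20, 0.01)    # P(X >= 2)
p_engine_2 = 1 - binomial_cdf(1, 35, 0.005)   # P(Y >= 2)
p_both = p_engine_1 * p_engine_2              # assumes independent engines

print(round(p_engine_1, 4), round(p_engine_2, 4), round(p_both, 5))
```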

91.14 Answers for Ch. 66: Poisson Distribution

Answer to Exercise 240. (a) The rate at which cats are killed is probably not constant. There are probably periods of months or years when cat-killers are particularly active, and other periods when the cat-killers are either in jail or inactive. (Indeed, between 2011 and 2014, relatively few cats were killed in northern Singapore. However, during 2015-2016, there were unusually many cats killed in northern Singapore.) Thus, the Poisson random variable is not a suitable model for the number of cats killed in northern Singapore.

(b) It is reasonable to suppose that errors occur at a constant rate and that the author is no more or less likely to make an error, regardless of when his last error was committed. Thus, the Poisson random variable is arguably a suitable model for the number of errors in this textbook. On the other hand, one could argue that errors do not occur at a constant rate. It is conceivable that the author is sometimes tired while working, and is thus more error-prone during such occasions. Other times he is high on caffeine (and possibly other stimulants) and is thus less error-prone.

(c) The rate at which you receive emails is probably not constant. For example, most of your emails received are probably during the day, because that is when most people are awake. Thus, the Poisson random variable is not a suitable model for the number of emails you receive in a 24-hour timespan.

Answer to Exercise 241.

$$P(X > 5) = 1 - P(X \leq 5) = 1 - [P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5)]$$

$$= 1 - e^{-4.2}\left(1 + 4.2 + \frac{4.2^2}{2!} + \frac{4.2^3}{3!} + \frac{4.2^4}{4!} + \frac{4.2^5}{5!}\right) \approx 0.247.$$

Answer to Exercise 242. $p_Y(0) = e^{-3.7} \approx 0.0247$.

Answer to Exercise 243. (a) 5500000 is the population of Singapore. $10^{-6}$ is the probability that a randomly chosen Singaporean is killed by lightning, in a given year.

(b) Since n ≥ 30 and p ≤ 0.05, a suitable approximation to X is Y ∼ Po(np) = Po(5.5). The probability that at least 5 people are killed by lightning strikes is

P(X ≥ 5) ≈ P(Y ≥ 5) = 1 − P(Y ≤ 4) ≈ 1 − 0.3575 = 0.6425,

where the value P(Y ≤ 4) ≈ 0.3575 was obtained either by reading off a Poisson table, using a graphing calculator, or manually doing the calculations on a calculator.

Answer to Exercise 244. The number of deaths by lightning strikes in Singapore can be modelled by $S \sim B(5500000, 10^{-6})$. Those in Malaysia can be modelled by $M \sim B(30000000, 10^{-7})$. Our goal is to find P(S + M ≥ 10).

Suitable approximations for S and M are X ∼ Po(5.5) and Y ∼ Po(3). And so a suitable approximation for S + M is X + Y ∼ Po(8.5). Hence, P(S + M ≥ 10) ≈ P(X + Y ≥ 10) = 1 − P(X + Y < 10) ≈ 0.347.
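For readers without a Poisson table to hand, the four Poisson probabilities in Exercises 241 to 244 can be computed from the formula directly. This is a small sketch, and `poisson_cdf` is our own helper name.

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return exp(-lam) * sum(lam**i / factorial(i) for i in range(k + 1))

ex241 = 1 - poisson_cdf(5, 4.2)   # P(X > 5), X ~ Po(4.2)
ex242 = poisson_cdf(0, 3.7)       # P(Y = 0), Y ~ Po(3.7)
ex243 = 1 - poisson_cdf(4, 5.5)   # P(Y >= 5), Y ~ Po(5.5)
ex244 = 1 - poisson_cdf(9, 8.5)   # P(X + Y >= 10), X + Y ~ Po(8.5)

print(round(ex241, 3), round(ex242, 4), round(ex243, 4), round(ex244, 3))
```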

91.15 Answers for Ch. 67: Continuous Uniform Distribution

Answer to Exercise 245.

(a) $$F_Y(k) = \begin{cases} 0, & \text{if } k < 3, \\ 0.5(k - 3), & \text{if } k \in [3, 5], \\ 1, & \text{if } k > 5. \end{cases}$$

(b) $$f_Y(k) = \begin{cases} 0.5, & \text{if } k \in [3, 5], \\ 0, & \text{otherwise.} \end{cases}$$

(c) P(3.1 ≤ Y ≤ 4.6) = 0.75 is in blue and P(4.8 ≤ Y ≤ 4.9) = 0.05 is in red.

91.16 Answers for Ch. 68: Normal Distribution

Answer to Exercise 246. (a) From Z-tables, P (Z ≥ 1.8) = 1 − P (Z ≤ 1.8) = 1 − Φ(1.8) ≈ 1 − 0.9641 = 0.0359.

[Graphing calculator screenshot and plots, horizontal axis from −4 to 4.]

(b) From Z-tables, P (−0.351 < Z < 1.2) = Φ(1.2) − Φ(−0.351) = Φ(1.2) − [1 − Φ(0.351)] ≈ 0.8849 − (1 − 0.6372) = 0.8849 − 0.3628 = 0.5221.

[Graphing calculator screenshot.]
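The Z-table lookups in Exercise 246 can be reproduced with the error function, since Φ(z) = ½(1 + erf(z/√2)). The sketch below is only a numerical check. The function name `Phi` is our own.

```python
from math import erf, sqrt

def Phi(z):
    """CDF of the standard normal random variable Z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

part_a = 1 - Phi(1.8)              # P(Z >= 1.8)
part_b = Phi(1.2) - Phi(-0.351)    # P(-0.351 < Z < 1.2)
print(round(part_a, 4), round(part_b, 4))
```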

Answer to Exercise 247. If µ = 0 and σ² = 1, then

$$f_X(a) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-0.5\left(\frac{a-\mu}{\sigma}\right)^2} = \frac{1}{1\sqrt{2\pi}}\,e^{-0.5\left(\frac{a-0}{1}\right)^2} = \frac{1}{\sqrt{2\pi}}\,e^{-0.5a^2} = \phi(a).$$

We’ve just shown that the PDF of X ∼ N(µ, σ²), when µ = 0 and σ² = 1, is the same as the PDF of the SNRV Z ∼ N(0, 1). Hence, the SNRV is indeed simply a normal random variable with mean µ = 0 and variance σ² = 1.

Answer to Exercise 248. First observe that $\frac{X - \mu}{\sigma} = \frac{X}{\sigma} + \frac{-\mu}{\sigma}$. Now simply use Fact 79, with $a = \frac{1}{\sigma}$ and $b = \frac{-\mu}{\sigma}$:

$$\frac{X - \mu}{\sigma} = \frac{X}{\sigma} + \frac{-\mu}{\sigma} \sim N\left(\frac{\mu}{\sigma} + \frac{-\mu}{\sigma},\ \frac{1}{\sigma^2}\sigma^2\right) = N(0, 1).$$

Answer to Exercise 249. Let X ∼ N(µ, σ²) and let $f_X$ and $F_X$ be the PDF and CDF of X.

1. We know from Fact 78 that Φ(∞) = 1. And by Corollary 6, $F_X(\infty)$ = P(X ≤ ∞) = Φ(∞). Thus, $F_X(\infty) = 1$.

2. $f_X(a) = \frac{1}{\sigma}\phi\left(\frac{a-\mu}{\sigma}\right)$. But we already know from Fact 78 that $\phi\left(\frac{a-\mu}{\sigma}\right) > 0$, and σ > 0.

3. E[X] = E[σZ + µ] = σE[Z] + µ = µ.

4. We know from Fact 78 that the standard normal PDF φ attains a global maximum at 0. That is, φ(0) ≥ φ(a), for all a ∈ R. Since X = σZ + µ, this is equivalent to $f_X(\sigma \cdot 0 + \mu) \geq f_X(\sigma \cdot a + \mu)$, for all a ∈ R. Equivalently, $f_X(\mu) \geq f_X(b)$, for all b ∈ R. That is, $f_X$ attains a global maximum at µ.

5. V[X] = V[σZ + µ] = σ²V[Z] = σ².

6. We know from Fact 78 that for any a ∈ R, we have P(Z ≤ a) = P(Z < a). Equivalently, for all a ∈ R, P(X ≤ σa + µ) = P(X < σa + µ). Equivalently, for all b ∈ R, P(X ≤ b) = P(X < b).

7. We know from Fact 78 that φ is symmetric about 0. Since X = σZ + µ, $f_X$ must likewise be symmetric about σ ⋅ 0 + µ = µ.

(a) By Corollary 6, $P(X \geq \mu + a) = P\left(Z \geq \frac{a}{\sigma}\right)$. By Fact 78, $P\left(Z \geq \frac{a}{\sigma}\right) = P\left(Z \leq -\frac{a}{\sigma}\right)$. Now again by Corollary 6, $P\left(Z \leq -\frac{a}{\sigma}\right) = P(X \leq \mu - a)$. Altogether then, P(X ≥ µ + a) = P(X ≤ µ − a), as desired. And of course, by definition, $P(X \leq \mu - a) = F_X(\mu - a)$.

(b) Obvious.

(c) Obvious.

8. First use Corollary 6: P(µ − σ ≤ X ≤ µ + σ) = P(−1 ≤ Z ≤ 1). Now use Fact 78: P(−1 ≤ Z ≤ 1) = 0.6827.

9. First use Corollary 6: P(µ − 2σ ≤ X ≤ µ + 2σ) = P(−2 ≤ Z ≤ 2). Now use Fact 78: P(−2 ≤ Z ≤ 2) = 0.9545.

10. First use Corollary 6: P(µ − 3σ ≤ X ≤ µ + 3σ) = P(−3 ≤ Z ≤ 3). Now use Fact 78: P(−3 ≤ Z ≤ 3) = 0.9973.

11. By Fact 78, φ has two points of inflexion, namely at ±1. That is, φ changes concavity at ±1. Since by Corollary 6, X = σZ + µ, $f_X$ must likewise change concavity at ±1 ⋅ σ + µ = µ ± σ. That is, $f_X$ has two points of inflexion, namely at µ ± σ.

Answer to Exercise 250. We are given that X ∼ N(2.14, 5) and Y ∼ N(−0.33, 2).

(a) $$P(X \geq 1) = P\left(Z \geq \frac{1 - 2.14}{\sqrt{5}}\right) \approx P(Z \geq -0.5098) = P(Z \leq 0.5098) = \Phi(0.5098) \approx 0.6949.$$

$$P(Y \geq 1) = P\left(Z \geq \frac{1 - (-0.33)}{\sqrt{2}}\right) \approx P(Z \geq 0.9405) = 1 - P(Z \leq 0.9405) = 1 - \Phi(0.9405) \approx 0.1735.$$

[Figure: P(X ≥ 1) and P(Y ≥ 1).]

(b) $$P(-2 \leq X \leq -1.5) = P\left(\frac{(-2) - 2.14}{\sqrt{5}} \leq Z \leq \frac{(-1.5) - 2.14}{\sqrt{5}}\right) \approx P(-1.8515 \leq Z \leq -1.6279) = P(1.6279 \leq Z \leq 1.8515)$$

$$= \Phi(1.8515) - \Phi(1.6279) \approx 0.9679 - 0.9482 = 0.0197.$$

$$P(-2 \leq Y \leq -1.5) = P\left(\frac{(-2) - (-0.33)}{\sqrt{2}} \leq Z \leq \frac{(-1.5) - (-0.33)}{\sqrt{2}}\right) \approx P(-1.1809 \leq Z \leq -0.8273) = P(0.8273 \leq Z \leq 1.1809)$$

$$= \Phi(1.1809) - \Phi(0.8273) \approx 0.8812 - 0.7959 = 0.0853.$$

[Figure: P(−2 ≤ X ≤ −1.5) and P(−2 ≤ Y ≤ −1.5).]
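The standardisation steps in Exercise 250 can be bypassed numerically with `statistics.NormalDist` from the Python standard library (Python 3.8 or later), which takes a mean and a standard deviation. The sketch below is only a check. Results may differ from table-based answers in the fourth decimal place because of rounding in the Z-tables.

```python
from statistics import NormalDist

X = NormalDist(mu=2.14, sigma=5 ** 0.5)    # variance 5
Y = NormalDist(mu=-0.33, sigma=2 ** 0.5)   # variance 2

p_x_a = 1 - X.cdf(1)                 # P(X >= 1)
p_y_a = 1 - Y.cdf(1)                 # P(Y >= 1)
p_x_b = X.cdf(-1.5) - X.cdf(-2)      # P(-2 <= X <= -1.5)
p_y_b = Y.cdf(-1.5) - Y.cdf(-2)      # P(-2 <= Y <= -1.5)

print(round(p_x_a, 4), round(p_y_a, 4), round(p_x_b, 4), round(p_y_b, 4))
```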


Answer to Exercise 251. (a) Let W ∼ N(25000, 64000000) and E ∼ N(200, 10000). Let B = 0.002W + 0.3E be the total bill in a given month. Then

$$B \sim N(0.002 \times 25000 + 0.3 \times 200,\ 0.002^2 \times 64000000 + 0.3^2 \times 10000) = N(50 + 60,\ 256 + 900) = N(110, 1156).$$

Thus, P(B > 100) ≈ 0.6157 (calculator).

(b) Let B₁ ∼ N(110, 1156), B₂ ∼ N(110, 1156), . . . , B₁₂ ∼ N(110, 1156) be the bills in each of the 12 months.

Then the total bill in a year is T = B₁ + B₂ + ⋅ ⋅ ⋅ + B₁₂ ∼ N(12 × 110, 12 × 1156) = N(1320, 13872). Thus, P(T > 1000) ≈ 0.9967 (calculator).

(c) The total bill in a given month is B = 0.002W + xE and

$$B \sim N(50 + 200x,\ 256 + 10000x^2).$$

Our goal is to find the value of x for which P(B > 100) = 0.1. We have

$$P(B > 100) = P\left(Z > \frac{100 - (50 + 200x)}{\sqrt{256 + 10000x^2}}\right) = P\left(Z > \frac{50 - 200x}{\sqrt{256 + 10000x^2}}\right) = 1 - \Phi\left(\frac{50 - 200x}{\sqrt{256 + 10000x^2}}\right) = 0.1.$$

From the Z-tables,

$$\Phi\left(\frac{50 - 200x}{\sqrt{256 + 10000x^2}}\right) = 0.9 \iff \frac{50 - 200x}{\sqrt{256 + 10000x^2}} \approx 1.2815.$$

One can rearrange, do the algebra (square both sides), and use the quadratic formula. Alternatively, one can simply use one’s graphing calculator to find that x ≈ 0.121. (Check: with x = 0.121, the left-hand side is 25.8/√402.41 ≈ 1.286.) We conclude that the maximum value of x is approximately 0.121, in order for the probability that the total utility bill in a given month exceeds $100 to be 0.1 or less.
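The equation in part (c) can also be solved numerically. The sketch below uses bisection on the interval [0, 0.25], which is our own choice of method and bracket. On that interval 50 − 200x is non-negative and falling while the standard deviation is rising, so P(B > 100) increases with x and the root is unique there.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal

def prob_bill_exceeds_100(x):
    """P(B > 100) when B ~ N(50 + 200x, 256 + 10000x^2)."""
    mean = 50 + 200 * x
    sd = (256 + 10_000 * x**2) ** 0.5
    return 1 - Z.cdf((100 - mean) / sd)

# Bisection: the probability is below 0.1 at x = 0 and equals 0.5 at x = 0.25.
lo, hi = 0.0, 0.25
for _ in range(60):
    mid = (lo + hi) / 2
    if prob_bill_exceeds_100(mid) < 0.1:
        lo = mid
    else:
        hi = mid

x_max = lo
print(round(x_max, 3), round(prob_bill_exceeds_100(x_max), 4))
```

Substituting the printed value of x back into the left-hand side of the equation above is a quick sanity check on whichever method is used.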


Answer to Exercise 252. From our earlier work, we know that each die roll has mean 3.5 and variance 35/12.

The CLT says that since n = 30 ≥ 30 is large enough and the distribution is “nice enough” (we are assuming this), X can be approximated by the normal random variable Y ∼ N(30 × 3.5, 30 × 35/12) = N(105, 1050/12). Thus, using also the continuity correction, we have P(100 ≤ X ≤ 110) ≈ P(99.5 ≤ Y ≤ 110.5) ≈ 0.4435 (calculator).

Answer to Exercise 253. Let X be the random variable that is the sum of the weights of the 5,000 Coco-Pops. The CLT says that since n = 5000 ≥ 30 is large enough and the distribution is “nice enough” (we are assuming this), X can be approximated by the normal random variable Y ∼ N(5000 × 0.1, 5000 × 0.004) = N(500, 20). Thus, P(X ≤ 499) ≈ P(Y ≤ 499) ≈ 0.4115 (calculator).

Answer to Exercise 254. Assume that the probability that a student passes is independent of whether or not other students pass. Assume also that the distribution is “nice enough”, so that since n = 1000 ≥ 30 is “big enough”, we can use the CLT.

Let X be the number of passes. Then X can be approximated by the normal random variable Y ∼ N(1000 × 0.9, 1000 × 0.9 × 0.1) = N(900, 90). Thus, using also the continuity correction, we have P(X ≥ 920) ≈ P(Y ≥ 919.5) ≈ 0.0199 (calculator). (This turns out to be a decent approximation because the exact probability, computed using the binomial distribution, is P(X ≥ 920) ≈ 0.0176.)
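To see how the CLT approximation in Exercise 254 compares with the exact binomial probability, both can be computed side by side. This is a sketch under the same independence assumption as the answer, with variable names of our own choosing.

```python
from math import comb
from statistics import NormalDist

n, p = 1000, 0.9
approx = NormalDist(mu=n * p, sigma=(n * p * (1 - p)) ** 0.5)

# Normal approximation with continuity correction: P(X >= 920) ~ P(Y >= 919.5).
normal_estimate = 1 - approx.cdf(919.5)

# Exact binomial tail: sum of P(X = k) for k = 920, ..., 1000.
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(920, n + 1))

print(round(normal_estimate, 4), round(exact, 4))
```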


Answer to Exercise 255. Assume there are 365 days in a year. Assume the number of accidents in a day is independent of the numbers of accidents on other days. Let the number of motor vehicle fatalities in a year be Y = X₁ + X₂ + ⋅ ⋅ ⋅ + X₃₆₅, where X₁ ∼ Po(0.5), X₂ ∼ Po(0.5), . . . , X₃₆₅ ∼ Po(0.5) are the numbers of motor vehicle fatalities in each of the 365 days.

The CLT says that since n = 365 ≥ 30 is large enough and the distribution is “nice enough” (we are assuming this), Y can be approximated by the normal random variable T ∼ N (365 × 0.5, 365 × 0.5) = N (182.5, 182.5). Thus, using also the continuity correction, we have P(Y > 200) ≈ P(T > 200.5) ≈ 0.0914.

This is a decent approximation, because the exact probability, computed using the Poisson distribution, is P(Y > 200) ≈ 0.0928.

91.17 Answers for Ch. 71: Sampling

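Answer to Exercise 256. The observed sample mean and observed sample variance are

$$\bar{x} = \frac{3 + 14 + 2 + 8 + 8 + 6 + 0}{7} = \frac{41}{7} \quad\text{and}\quad s^2 = \frac{\sum x_i^2 - n\bar{x}^2}{n-1} = \frac{9 + 196 + 4 + 64 + 64 + 36 - 41^2/7}{6} = \frac{155}{7}.$$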
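The arithmetic in Exercise 256 can be checked exactly with rational numbers. This is a small sketch, assuming the seven data values are those shown in the sum for the sample mean; it uses the shortcut formula for the sample variance quoted in the answer.

```python
from fractions import Fraction

data = [3, 14, 2, 8, 8, 6, 0]
n = len(data)

# Observed sample mean as an exact fraction
mean = Fraction(sum(data), n)

# Shortcut formula: s^2 = (sum of squares - n * mean^2) / (n - 1)
variance = (sum(x * x for x in data) - n * mean**2) / (n - 1)

print(mean, variance)
```

Working with `Fraction` rather than floats avoids rounding, so the result can be compared directly with the fractions in the answer.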

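Answer to Exercise 257.

(a) The sample mean and sample variance are

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{1885}{10} = 188.5, \qquad s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2 / n}{n-1} = \frac{378{,}265 - 1885^2/10}{9} \approx 2550.$$

(b) The sample mean and sample variance are

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{\sum_{i=1}^{n} (x_i - 50 + 50)}{n} = \frac{\sum_{i=1}^{n} (x_i - 50) + \sum_{i=1}^{n} 50}{n} = \frac{1885 + 50n}{n} = \frac{1885}{n} + 50 = 238.5,$$

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - 50)^2 - \left[\sum_{i=1}^{n} (x_i - 50)\right]^2 / n}{n-1} = \frac{378{,}265 - 1885^2/10}{9} \approx 2550.$$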

Answer to Exercise 258. (a) Assume that the weights of the five Singaporeans sampled are independently and identically distributed. Then unbiased estimates for the population mean µ and variance σ² of the weights of Singaporeans are, respectively, the observed sample mean x̄ and observed sample variance s²:

$$\bar{x} = \frac{\sum x_i}{n} = \frac{32 + 88 + 67 + 75 + 56}{5} = 63.6, \qquad s^2 = \frac{\sum x_i^2 - n\bar{x}^2}{n-1} = \frac{32^2 + 88^2 + 67^2 + 75^2 + 56^2 - 5 \times 63.6^2}{4} = 448.3.$$

(b) We don’t know! And unless we literally gather and weigh every single Singaporean, we will never know what exactly the average weight of a Singaporean is. All we’ve found in part (a) is an estimate (63.6 kg) for the average weight of a Singaporean. We know that on average, the estimator we use “gets it right”. However, it could well be that we’re unlucky (and got 5 unusually heavy or unusually light persons) and the estimate of 63.6 kg is thus way off.


Answer to Exercise 259.

$$E[\bar{X}] = E\left[\frac{X_1 + X_2 + \dots + X_n}{n}\right] = \frac{E[X_1 + X_2 + \dots + X_n]}{n} = \frac{E[X_1] + E[X_2] + \dots + E[X_n]}{n} = \frac{\mu + \mu + \dots + \mu}{n} = \frac{n\mu}{n} = \mu.$$

We have just shown that $E[\bar{X}] = \mu$. In other words, we’ve just shown that $\bar{X}$ is an unbiased estimator for µ.

Answer to Exercise 260. (a) The observed random sample is (x1 , x2 , . . . , x10 ) = (1, 1, 1, 1, 1, 1, 1, 0, 0, 0). The observed sample mean and observed sample variance are

$$\bar{x} = \frac{x_1 + x_2 + \dots + x_{10}}{n} = 0.7, \qquad s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_{10} - \bar{x})^2}{n-1} = \frac{7 \cdot 0.3^2 + 3 \cdot 0.7^2}{9} = 0.2\dot{3}.$$

(b) Yes, the observed sample mean x̄ = 0.7 is an unbiased estimate for the true population mean µ (i.e. the true proportion of coin flips that are heads).

And yes, the observed sample variance $s^2 = 0.2\dot{3}$ is an unbiased estimate for the true population variance σ².

(c) No, this is merely one observed random sample, from which we generated a single estimate (“guess”), namely x̄ = 0.7, of the true population mean µ.

All we know is that the sample mean $\bar{X}$ is an unbiased estimator for the true population mean µ. That is, the average estimate generated by $\bar{X}$ will equal µ. However, any particular estimate x̄ may or may not be equal to µ. Indeed, if we’re unlucky, our particular estimate may be very far from the true µ.

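Answer to Exercise 261. Since X1 , X2 , . . . , Xn are independent, each with variance σ², the variance of their sum is the sum of their variances. Hence,

$$V[\bar{X}] = V\left[\frac{1}{n}(X_1 + X_2 + \dots + X_n)\right] = \frac{1}{n^2} V[X_1 + X_2 + \dots + X_n] = \frac{1}{n^2}\left(V[X_1] + V[X_2] + \dots + V[X_n]\right) = \frac{1}{n^2}(n\sigma^2) = \frac{\sigma^2}{n}.$$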


Answer to Exercise 262.

(a) The population mean µ is the number defined by $\mu = \sum_{i=1}^{k} x_i / k$. It is the average across all population values.

(b) The population variance σ² is the number defined by $\sigma^2 = \sum_{i=1}^{k} (x_i - \mu)^2 / k$. It measures the dispersion across the population values.

(c) The sample mean $\bar{X}$ is a random variable defined by $\bar{X} = \sum_{i=1}^{n} X_i / n$. It is the average of all values in a random sample.

(d) The sample variance $S^2$ is a random variable defined by $S^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 / (n - 1)$. It measures the dispersion across the values in a random sample.

(e) The mean of the sample mean, also called the expected value of the sample mean, is the number $E[\bar{X}]$. The interpretation is that if we have infinitely-many observed random samples of size n and calculate the observed sample mean for each, then $E[\bar{X}]$ is equal to the average across the observed sample means. It can be shown that $E[\bar{X}] = \mu$ and hence that the sample mean $\bar{X}$ is an unbiased estimator for the population mean µ.

(f) The variance of the sample mean is the number $V[\bar{X}]$. The interpretation is that if we have infinitely-many observed random samples of size n and calculate the observed sample mean for each, then $V[\bar{X}]$ measures the dispersion across the observed sample means.

(g) The mean of the sample variance, also called the expected value of the sample variance, is the number $E[S^2]$. The interpretation is that if we have infinitely-many observed random samples of size n and calculate the observed sample variance for each, then $E[S^2]$ is equal to the average across the observed sample variances. It can be shown that $E[S^2] = \sigma^2$ and hence that the sample variance $S^2$ is an unbiased estimator for the population variance σ².

(h) Given an observed random sample, e.g. (x1 , x2 , x3 ) = (1, 1, 0), we can calculate the corresponding observed sample mean as

$$\bar{x} = \frac{x_1 + x_2 + x_3}{3} = \frac{1 + 1 + 0}{3} = \frac{2}{3}.$$

The observed sample mean is the average of all values in an observed random sample.

(i) Given an observed random sample, e.g. (x1 , x2 , x3 ) = (1, 1, 0), we can calculate the corresponding observed sample variance as

$$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2}{3 - 1} = \frac{1/9 + 1/9 + 4/9}{2} = \frac{1}{3}.$$

The observed sample variance measures the dispersion across the values in an observed random sample.
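The "infinitely-many samples" interpretation in parts (e) to (g) can be made concrete for a tiny population by listing every possible sample. The sketch below is only an illustration: the population values `[1, 2, 6]` and the sample size 2 are hypothetical choices, and sampling is assumed to be with replacement so that the draws are independent and identically distributed.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 6]   # hypothetical population values
n = 2                    # hypothetical sample size

k = len(population)
mu = Fraction(sum(population), k)
sigma2 = sum((x - mu) ** 2 for x in population) / k

# All k**n equally likely ordered samples, drawn with replacement.
samples = list(product(population, repeat=n))

def sample_mean(s):
    return Fraction(sum(s), len(s))

def sample_var(s):
    m = sample_mean(s)
    return sum((x - m) ** 2 for x in s) / (len(s) - 1)

means = [sample_mean(s) for s in samples]
variances = [sample_var(s) for s in samples]

e_mean = sum(means) / len(samples)                               # E[X-bar]
v_mean = sum((m - e_mean) ** 2 for m in means) / len(samples)    # V[X-bar]
e_var = sum(variances) / len(samples)                            # E[S^2]

print(e_mean == mu, v_mean == sigma2 / n, e_var == sigma2)
```

Averaging over all nine samples plays the role of averaging over infinitely many observed samples, so the three comparisons check unbiasedness of the sample mean, the variance of the sample mean, and unbiasedness of the sample variance.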


91.18 Answers for Ch. 72: Null Hypothesis Significance Testing

Answer to Exercise 263. Let µ be the probability that a coin-flip is heads. The null and alternative hypotheses are H0 ∶ µ = 0.5 and HA ∶ µ > 0.5.

Our random sample is 20 coin-flips: (X1 , X2 , . . . , X20 ), where Xi takes on the value 1 if the ith coin-flip is heads and 0 otherwise. Our test statistic is the number of heads: T = X1 + X2 + ⋅ ⋅ ⋅ + X20 .

In our observed random sample (x1 , x2 , . . . , x20 ), there are 17 heads. So the observed test statistic is t = 17. Assuming H0 were true, we’d have T ∼ B (20, 0.5). Thus, the p-value is

$$P(T \ge 17 \mid H_0) = P(T = 17 \mid H_0) + P(T = 18 \mid H_0) + P(T = 19 \mid H_0) + P(T = 20 \mid H_0)$$

$$= \binom{20}{17} 0.5^{17}\, 0.5^{3} + \binom{20}{18} 0.5^{18}\, 0.5^{2} + \binom{20}{19} 0.5^{19}\, 0.5^{1} + \binom{20}{20} 0.5^{20}\, 0.5^{0} \approx 0.0013.$$

Since p ≈ 0.0013 < α = 0.05, we reject H0 at the 5% significance level.
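The binomial tail sum behind this p-value is easy to reproduce. The following sketch uses a hypothetical helper `upper_tail` (a name introduced here, not from the textbook) to add up the binomial probabilities from t to n.

```python
from math import comb

def upper_tail(n, t, p=0.5):
    """P(T >= t) for T ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(t, n + 1))

# One-tailed p-value for observing 17 heads in 20 flips of a fair coin
p_value = upper_tail(20, 17)
print(round(p_value, 4))
```

For the two-tailed test with a fair coin under the null hypothesis, the distribution of T is symmetric about 10, so the corresponding two-tailed p-value is simply twice this number.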

Answer to Exercise 264. Let µ be the true long-run proportion of coin-flips that are heads. The null and alternative hypotheses are H0 ∶ µ = 0.5 and HA ∶ µ ≠ 0.5.

Our random sample is 20 coin-flips: (X1 , X2 , . . . , X20 ), where Xi takes on the value 1 if the ith coin-flip is heads and 0 otherwise. Our test statistic is the number of heads: T = X1 + X2 + ⋅ ⋅ ⋅ + X20 .

In our observed random sample (x1 , x2 , . . . , x20 ), there are 17 heads. So the observed test statistic is t = 17. Assuming H0 were true, we’d have T ∼ B (20, 0.5). Thus, the p-value is

$$P(T \ge 17 \text{ or } T \le 3 \mid H_0) = P(T = 0 \mid H_0) + \dots + P(T = 3 \mid H_0) + P(T = 17 \mid H_0) + \dots + P(T = 20 \mid H_0)$$

$$= \binom{20}{0} 0.5^{0}\, 0.5^{20} + \dots + \binom{20}{3} 0.5^{3}\, 0.5^{17} + \binom{20}{17} 0.5^{17}\, 0.5^{3} + \dots + \binom{20}{20} 0.5^{20}\, 0.5^{0} \approx 0.0026.$$

Since p ≈ 0.0026 < α = 0.05, we reject H0 at the 5% significance level.


Answer to Exercise 265. Let µ be the probability that a coin-flip is heads.

(a) The competing hypotheses are H0 ∶ µ = 0.5, HA ∶ µ > 0.5.

The test statistic T is the number of heads (out of the 20 coin-flips). For t = 14, the corresponding p-value is

$$P(T \ge 14 \mid H_0) = P(T = 14 \mid H_0) + P(T = 15 \mid H_0) + \dots + P(T = 20 \mid H_0) = \binom{20}{14} 0.5^{14}\, 0.5^{6} + \binom{20}{15} 0.5^{15}\, 0.5^{5} + \dots + \binom{20}{20} 0.5^{20}\, 0.5^{0} \approx 0.05766.$$

For t = 15, the corresponding p-value is

$$P(T \ge 15 \mid H_0) = P(T = 15 \mid H_0) + P(T = 16 \mid H_0) + \dots + P(T = 20 \mid H_0) = \binom{20}{15} 0.5^{15}\, 0.5^{5} + \binom{20}{16} 0.5^{16}\, 0.5^{4} + \dots + \binom{20}{20} 0.5^{20}\, 0.5^{0} \approx 0.02069.$$

Thus, the critical value is 15 (this is the value of t at which we are just able to reject H0 at the α = 0.05 significance level). And the critical region is {15, 16, . . . , 20} (this is the set of values of t at which we’d be able to reject H0 at the α = 0.05 significance level).

(b) The competing hypotheses are H0 ∶ µ = 0.5, HA ∶ µ ≠ 0.5.

The test statistic T is the number of heads (out of the 20 coin-flips). For t = 14, the corresponding p-value is

$$P(T \ge 14 \text{ or } T \le 6 \mid H_0) = 1 - P(7 \le T \le 13 \mid H_0) = 1 - \left[\binom{20}{7} 0.5^{7}\, 0.5^{13} + \binom{20}{8} 0.5^{8}\, 0.5^{12} + \dots + \binom{20}{13} 0.5^{13}\, 0.5^{7}\right] \approx 0.1153.$$

For t = 15, the corresponding p-value is

$$P(T \ge 15 \text{ or } T \le 5 \mid H_0) = 1 - P(6 \le T \le 14 \mid H_0) = 1 - \left[\binom{20}{6} 0.5^{6}\, 0.5^{14} + \binom{20}{7} 0.5^{7}\, 0.5^{13} + \dots + \binom{20}{14} 0.5^{14}\, 0.5^{6}\right] \approx 0.04139.$$

Thus, in the upper tail the critical value is 15 and the critical region is {15, 16, . . . , 20}. By symmetry, in the lower tail the critical value is 5 and the critical region is {0, 1, . . . , 5}.
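The search for the critical value in Exercise 265 amounts to scanning t upwards until the p-value first drops to α or below. Here is a minimal sketch of that scan for a fair coin; the function name `upper_tail` and the variable names are assumptions made for this illustration.

```python
from math import comb

def upper_tail(n, t):
    """P(T >= t) for T ~ B(n, 0.5), as a ratio of integer counts."""
    return sum(comb(n, k) for k in range(t, n + 1)) / 2**n

n, alpha = 20, 0.05

# One-tailed test: smallest t whose p-value P(T >= t) is at most alpha.
one_tailed_critical = min(t for t in range(n + 1) if upper_tail(n, t) <= alpha)

# Two-tailed test: by symmetry of B(n, 0.5), the p-value for t > n/2 doubles.
two_tailed_critical = min(
    t for t in range(n // 2 + 1, n + 1) if 2 * upper_tail(n, t) <= alpha
)

print(one_tailed_critical, two_tailed_critical)
```

Both scans stop at the same t here because doubling the upper-tail probability at that t still leaves it below 0.05, whereas at the previous t even the one-tailed p-value exceeds 0.05.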


Answer to Exercise 266. The competing hypotheses are: H0 ∶ µ = 34, HA ∶ µ ≠ 34.

The observed sample mean is

$$\bar{x} = \frac{35 + 35 + 31 + 32 + 33 + 34 + 31 + 34 + 35 + 34}{10} = 33.4.$$

The corresponding p-value is

$$p = P(\bar{X} \le 33.4 \text{ or } \bar{X} \ge 34.6 \mid H_0) = P(\bar{X} \le 33.4 \mid H_0) + P(\bar{X} \ge 34.6 \mid H_0) = P\left(Z \le \frac{33.4 - 34}{\sqrt{9/10}}\right) + P\left(Z \ge \frac{34.6 - 34}{\sqrt{9/10}}\right) \approx 0.5271.$$

The large p-value does not cast doubt on or provide evidence against H0 . We fail to reject H0 at the α = 0.05 significance level.

Answer to Exercise 267. The competing hypotheses are: H0 ∶ µ = 34, HA ∶ µ ≠ 34.

The observed sample mean is x̄ = 33.4.

The corresponding p-value is

$$p = P(\bar{X} \le 33.4 \text{ or } \bar{X} \ge 34.6 \mid H_0) = P(\bar{X} \le 33.4 \mid H_0) + P(\bar{X} \ge 34.6 \mid H_0) \overset{\text{CLT}}{\approx} P\left(Z \le \frac{33.4 - 34}{\sqrt{9/100}}\right) + P\left(Z \ge \frac{34.6 - 34}{\sqrt{9/100}}\right) \approx 0.04550.$$

The small p-value casts doubt on or provides evidence against H0 . We reject H0 at the α = 0.05 significance level.


Answer to Exercise 268. The competing hypotheses are: H0 ∶ µ = 34, HA ∶ µ ≠ 34.

The observed sample mean is x̄ = 33.4. And the observed sample variance is s² = 11.2. The corresponding p-value is

$$p = P(\bar{X} \le 33.4 \text{ or } \bar{X} \ge 34.6 \mid H_0) = P(\bar{X} \le 33.4 \mid H_0) + P(\bar{X} \ge 34.6 \mid H_0) \overset{\text{CLT}}{\approx} P\left(Z \le \frac{33.4 - 34}{\sqrt{11.2/100}}\right) + P\left(Z \ge \frac{34.6 - 34}{\sqrt{11.2/100}}\right) \approx 0.07300.$$

The fairly small p-value casts some doubt on or provides some evidence against H0 . But we fail to reject H0 at the α = 0.05 significance level.
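The two-tailed p-values in Exercises 266 to 268 all follow the same recipe: standardise the distance between the observed sample mean and the hypothesised mean, then double the upper-tail normal probability. A short sketch, assuming a hypothetical helper named `two_tailed_p`:

```python
from math import erfc, sqrt

def two_tailed_p(xbar, mu0, var, n):
    """Two-tailed p-value for a z-test of H0: mu = mu0."""
    z = abs(xbar - mu0) / sqrt(var / n)
    # For Z ~ N(0, 1), P(|Z| >= z) equals erfc(z / sqrt(2)).
    return erfc(z / sqrt(2))

p_266 = two_tailed_p(33.4, 34, 9, 10)       # known variance 9, n = 10
p_267 = two_tailed_p(33.4, 34, 9, 100)      # known variance 9, n = 100
p_268 = two_tailed_p(33.4, 34, 11.2, 100)   # sample variance 11.2, n = 100

print(round(p_266, 4), round(p_267, 4), round(p_268, 4))
```

Comparing the first two calls shows how the same observed mean becomes much stronger evidence against H0 once the sample size grows from 10 to 100.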


Answer to Exercise 269. The competing hypotheses are: H0 ∶ µ = 34 and HA ∶ µ ≠ 34. The observed sample mean and observed sample variance are

$$\bar{x} = \frac{35 + 35 + 31 + 32 + 33 + 34 + 31 + 34 + 35 + 34}{10} = 33.4,$$

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{(35 - 33.4)^2 + (35 - 33.4)^2 + \dots + (34 - 33.4)^2}{10 - 1} \approx 2.489.$$

The corresponding p-value is

$$p = P(\bar{X} \le 33.4 \text{ or } \bar{X} \ge 34.6 \mid H_0) = P(\bar{X} \le 33.4 \mid H_0) + P(\bar{X} \ge 34.6 \mid H_0) = P\left(T_9 \le \frac{33.4 - 34}{\sqrt{2.489/10}}\right) + P\left(T_9 \ge \frac{34.6 - 34}{\sqrt{2.489/10}}\right) \approx 0.2598.$$

The large p-value does not cast doubt on or provide evidence against H0 . We fail to reject H0 at the α = 0.05 significance level.

Answer to Exercise 270. The observed sample mean is x̄ = 68 and the observed sample variance (use Fact 81(a)) is

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \left[\sum_{i=1}^{n} x_i\right]^2 / n}{n - 1} = \frac{50 \times 5000 - (68 \times 50)^2 / 50}{49} \approx 383.7.$$

Let µ be the true average weight of a Singaporean. The competing hypotheses are H0 ∶ µ = 75 and HA ∶ µ < 75.

(This is a one-tailed test, because your friend’s claim is that the average American is heavier than the average Singaporean. If the claim were instead that the average American’s weight is different from the average Singaporean’s, then we’d have a two-tailed test.)

Since the sample size n = 50 is “large enough”, we can appeal to the CLT. The p-value is

$$p = P(\bar{X} \le 68 \mid H_0) \overset{\text{CLT}}{\approx} P\left(Z \le \frac{68 - 75}{\sqrt{383.7/50}}\right) \approx 0.0058.$$

The small p-value casts doubt on or provides evidence against H0 . We can reject H0 at any conventional significance level (α = 0.1, α = 0.05, or α = 0.01).


91.19 Answers for Ch. 73: Correlation and Linear Regression

Answer to Exercise 271.

[Figure: scatter diagram of q against p ($).]

Answer to Exercise 272. Compute p̄ = (8 + 9 + 4 + 10 + 8) /5 = 7.8 and q̄ = (300 + 250 + 1000 + 400 + 400) /5 = 470. Also,

$$\sum_{i=1}^{n} (p_i - \bar{p})(q_i - \bar{q}) = (8 - 7.8)(300 - 470) + (9 - 7.8)(250 - 470) + \dots + (8 - 7.8)(400 - 470) = -2480,$$

$$\sqrt{\sum_{i=1}^{n} (p_i - \bar{p})^2} = \sqrt{(8 - 7.8)^2 + (9 - 7.8)^2 + (4 - 7.8)^2 + (10 - 7.8)^2 + (8 - 7.8)^2} = \sqrt{20.8} \approx 4.56070170,$$

$$\sqrt{\sum_{i=1}^{n} (q_i - \bar{q})^2} = \sqrt{(300 - 470)^2 + (250 - 470)^2 + \dots + (400 - 470)^2} = \sqrt{368000} \approx 606.63003552.$$

Thus,

$$r = \frac{\sum_{i=1}^{n} (p_i - \bar{p})(q_i - \bar{q})}{\sqrt{\sum_{i=1}^{n} (p_i - \bar{p})^2}\,\sqrt{\sum_{i=1}^{n} (q_i - \bar{q})^2}} \approx \frac{-2480}{4.56070170 \times 606.63003552} \approx -0.8964.$$
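The correlation coefficient above can be recomputed directly from the five data points. This sketch mirrors the three sums in the answer; the variable names `s_pq`, `s_pp`, and `s_qq` are labels chosen here for those sums.

```python
from math import sqrt

p = [8, 9, 4, 10, 8]                # prices ($)
q = [300, 250, 1000, 400, 400]      # quantities
n = len(p)

p_bar = sum(p) / n
q_bar = sum(q) / n

# Sums of cross-products and squared deviations
s_pq = sum((pi - p_bar) * (qi - q_bar) for pi, qi in zip(p, q))
s_pp = sum((pi - p_bar) ** 2 for pi in p)
s_qq = sum((qi - q_bar) ** 2 for qi in q)

r = s_pq / sqrt(s_pp * s_qq)
print(round(r, 4))
```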


Answer to Exercise 273.

(a) We already computed (in the previous exercise) that p̄ = 7.8, q̄ = 470, $\sum_{i=1}^{n} (p_i - \bar{p})(q_i - \bar{q}) = -2480$ and $\sum_{i=1}^{n} (p_i - \bar{p})^2 = 20.8$. So,

$$\hat{b} = \frac{\sum_{i=1}^{n} (p_i - \bar{p})(q_i - \bar{q})}{\sum_{i=1}^{n} (p_i - \bar{p})^2} = \frac{-2480}{20.8} \approx -119.2.$$

Thus, the regression line of q on p is q − q̄ = b̂ (p − p̄) or q − 470 = −119.2 (p − 7.8) or q = 1400 − 119.2p.

(b)

i                    1      2      3      4      5
p_i ($)              8      9      4     10      8
q_i                300    250   1000    400    400
q̂_i                446    327    923    208    446
û_i = q_i − q̂_i   −146    −77     77    192    −46

(c) [Figure: graph of q against p ($).]

(d) The SSR is $\sum_{i=1}^{5} \hat{u}_i^2 \approx 72308$ (computed from the unrounded residuals, which are approximately −146, −77, 77, 192, and −46).
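The regression line, fitted values, residuals, and SSR in Exercise 273 can all be generated from the same two sums used for the correlation coefficient. The following is a sketch under the usual OLS formulas; `a_hat` and `b_hat` are names chosen here for the intercept and slope.

```python
p = [8, 9, 4, 10, 8]                # prices ($)
q = [300, 250, 1000, 400, 400]      # quantities
n = len(p)

p_bar = sum(p) / n
q_bar = sum(q) / n

s_pq = sum((pi - p_bar) * (qi - q_bar) for pi, qi in zip(p, q))
s_pp = sum((pi - p_bar) ** 2 for pi in p)

b_hat = s_pq / s_pp               # slope of the regression line of q on p
a_hat = q_bar - b_hat * p_bar     # intercept

fitted = [a_hat + b_hat * pi for pi in p]
residuals = [qi - fi for qi, fi in zip(q, fitted)]
ssr = sum(u * u for u in residuals)

print(round(a_hat), round(b_hat, 1), [round(f) for f in fitted], round(ssr))
```

Note that squaring the rounded residuals shown in the table gives a slightly smaller total than the SSR computed from the unrounded residuals, which is why the code keeps full precision until the final print.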


Answer to Exercise 274. [Calculator screenshots after each of Steps 1 to 12.]

The TI84 tells us that r = −.8963881445 and the regression line is y = ax + b = −119.2307692x + 1400. This is indeed consistent with the answers from the previous exercises.

Answer to Exercise 275. In the previous exercises, we already calculated that the OLS line of best fit is q = 1400 − 119.2p. Thus,

(a) By interpolation, a barber who charged $7 per haircut sold 1400 − 119.2 × 7 ≈ 566 haircuts.

(b) By extrapolation, a barber who charged $200 per haircut sold 1400 − 119.2 × 200 = −22440 haircuts. This is plainly absurd, so the second prediction is obviously less reliable than the first.


Answer to Exercise 276.

(a) r ≈ 0.954.

(b) r ≈ 0.984.


92 Answers to Exercises in Part VII (2006-2015 A-Level Exams)

92.1 Answers for Ch. 74: Functions and Graphs

Answer to Exercise 277 (9740 N2015/I/1). (i) Compute $\dfrac{dy}{dx} = -\dfrac{2a}{x^3} + b$. From the information given, we have this system of equations:

$$\frac{a}{1.6^2} + 1.6b + c = -2.4, \qquad \frac{a}{(-0.7)^2} - 0.7b + c = 3.6, \qquad \left.\frac{dy}{dx}\right|_{x=1} = -\frac{2a}{1^3} + b = 2.$$

So a ≈ −3.593, b ≈ −5.187, c ≈ 7.303 (calculator).

(ii) $\dfrac{a}{x^2} + bx + c = 0 \implies x \approx -0.589$ (calculator).

(iii) As x → ±∞, y → bx + c. Hence, the other asymptote is y = bx + c or y = −5.187x + 7.303.


Answer to Exercise 278 (9740 N2015/I/2). (i) You can easily graph these two equations on your calculator and copy. But as an exercise, I’ll do this without a calculator.

First draw the graph of $y = \dfrac{x+1}{1-x}$. Write $\dfrac{x+1}{1-x} = -1 + \dfrac{2}{1-x}$. This is a hyperbola with two distinct branches.

• Intercepts. The graph of $y = \dfrac{x+1}{1-x}$ crosses the vertical axis at (0, 1) and the horizontal axis at (−1, 0).
• Asymptotes. Observe that as x → 1, $\dfrac{x+1}{1-x} \to \pm\infty$, so that the graph of $y = \dfrac{x+1}{1-x}$ has vertical asymptote x = 1. And as x → ±∞, (x + 1)/(1 − x) → −1, so that the graph has horizontal asymptote y = −1.
• The centre is thus (1, −1).
• The two lines of symmetry run through the centre and bisect the angles formed by the asymptotes.

Armed with the graph of $y = \dfrac{x+1}{1-x}$, it is easy to draw the graph of $y = \left|\dfrac{x+1}{1-x}\right|$: simply reflect the parts of the graph of $y = \dfrac{x+1}{1-x}$ where y < 0 in the horizontal axis. We can also draw y = x + 2.

[Figure: graphs of y = (x + 1)/(1 − x) and of y = |(x + 1)/(1 − x)| together with y = x + 2, showing the intercepts (0, 1) and (−1, 0), the asymptotes x = 1 and y = −1, and the intersection points P, Q, and R.]

(ii) Using your graphing calculator, the intersection points have x-coordinates Px ≈ −1.732, Qx ≈ 0.414, and Rx ≈ 1.732. Hence, the inequality holds if and only if x ∈ (−1.732, 0.414) ∪ (1.732, ∞).

As an exercise, let me also do this without a calculator. The equation ∣(x + 1)/(1 − x)∣ = x + 2 holds if and only if

(a) “$\dfrac{x+1}{1-x} = x + 2$ AND x ∈ [−1, 1)” OR

(b) “$-\dfrac{x+1}{1-x} = x + 2$ AND x ∉ [−1, 1)”.

Now,

$$\frac{x+1}{1-x} = x + 2 \iff x + 1 = (1 - x)(x + 2) \iff x^2 + 2x - 1 = 0 \iff x = -1 \pm \sqrt{2},$$

$$-\frac{x+1}{1-x} = x + 2 \iff -x - 1 = (1 - x)(x + 2) \iff x^2 - 3 = 0 \iff x = \pm\sqrt{3}.$$

So condition (a) is equivalent to $x = -1 + \sqrt{2}$. This is the x-coordinate of Q.

And condition (b) is equivalent to $x = \pm\sqrt{3}$. These are the x-coordinates of P and R.

Altogether then,

$$\left|\frac{x+1}{1-x}\right| < x + 2 \iff x \in \left(-\sqrt{3}, -1 + \sqrt{2}\right) \cup \left(\sqrt{3}, \infty\right).$$


Answer to Exercise 279 (9740 N2015/I/5). (i) First, move (the graph of the equation y = x²) 3 units rightwards to get the graph of the equation y = (x − 3)². Then stretch it vertically, outwards from the horizontal axis by a factor of 0.25 to get the graph of the equation y = 0.25(x − 3)².

(ii) The easy way is to use your graphing calculator and copy. But as an exercise, I’ll do it without a calculator. f (1) = 1 and $\lim_{x \to 1} f(x) = 1$. Similarly, f (3) = 0 and $\lim_{x \to 3} f(x) = 0$. (These say that f is continuous at both 1 and 3.)

(iii) Again, the easy way is by graphing calculator, but again as an exercise, let’s also do it without a calculator.

Method #1: Mechanically do the algebra. First replace each x with 0.5x.

$$f(x) = \begin{cases} 1 & \text{for } 0 \le x \le 1, \\ 0.25(x - 3)^2 & \text{for } 1 < x \le 3, \\ 0 & \text{otherwise} \end{cases} \implies f(0.5x) = \begin{cases} 1 & \text{for } 0 \le 0.5x \le 1, \\ 0.25(0.5x - 3)^2 & \text{for } 1 < 0.5x \le 3, \\ 0 & \text{otherwise} \end{cases}$$

$$\implies f(0.5x) = \begin{cases} 1 & \text{for } 0 \le x \le 2, \\ 0.25(0.5x - 3)^2 & \text{for } 2 < x \le 6, \\ 0 & \text{otherwise} \end{cases} \implies 1 + f(0.5x) = \begin{cases} 2 & \text{for } 0 \le x \le 2, \\ 1 + 0.25(0.5x - 3)^2 & \text{for } 2 < x \le 6, \\ 1 & \text{otherwise.} \end{cases}$$

Method #2: Reason it through. Take the original (red) graph. For the piece 0 ≤ x ≤ 1, stretch horizontally, outwards from the vertical axis, by a factor of 2. Then shift up by 1 unit. Do the same for the piece 1 ≤ x ≤ 4, and again for the piece −1 ≤ x ≤ 0.


Answer to Exercise 280 (9740 N2015/II/3). (a) (i) The range of f is (−∞, 0). Pick any element y in the range of f and write: $y = \dfrac{1}{1 - x^2} \iff \dfrac{1}{y} = 1 - x^2$ (the division by y is permissible ∵ y ≠ 0) $\iff x^2 = 1 - 1/y \iff x = \pm\sqrt{1 - 1/y}$.

We can reject the negative value of x since x > 1. Altogether then, every y in the range corresponds to a unique element in the domain, namely $\sqrt{1 - 1/y}$. And so this function is invertible.

(ii) From our work above, the inverse function is f⁻¹ ∶ (−∞, 0) → (1, ∞) defined by $y \mapsto \sqrt{1 - 1/y}$.

(b) Let $y = \dfrac{2 + x}{1 - x^2}$. Rearranging, yx² + x + 2 − y = 0. This is a quadratic in x. Since x ∈ R, the discriminant of this quadratic must be non-negative: 1² − 4y(2 − y) ≥ 0.

Rearranging, 4y² − 8y + 1 ≥ 0. This is a ∪-shaped quadratic with zeros

$$\frac{8 \pm \sqrt{(-8)^2 - 4(4)(1)}}{2(4)} = 1 \pm 0.5\sqrt{3}.$$

Hence,

$$1^2 - 4y(2 - y) \ge 0 \iff y \in \left(-\infty, 1 - 0.5\sqrt{3}\right] \cup \left[1 + 0.5\sqrt{3}, \infty\right).$$

Answer to Exercise 281 (9740 N2014/I/1). (i)

$$f^2(x) = f\left(\frac{1}{1 - x}\right) = \frac{1}{1 - 1/(1 - x)} = \frac{1 - x}{1 - x - 1} = \frac{1 - x}{-x} = 1 - \frac{1}{x}.$$

To show that f²(x) = f⁻¹(x), we need merely show that f²(y) = x ⇐⇒ f (x) = y. To this end, write

$$f^2(y) = x \iff 1 - \frac{1}{y} = x \iff f(x) = f\left(1 - \frac{1}{y}\right) = \frac{1}{1 - (1 - 1/y)} = \frac{1}{1/y} = y.$$

(ii) f³(x) = f f²(x) = f f⁻¹(x) = x.


Answer to Exercise 282 (9740 N2014/I/4). (i) Recall that any y² = f (x) graph is symmetric in the horizontal axis (see section 15.9). Also, it is empty where f (x) < 0. So the graph of y² = f (x) is empty to the left of A and between B and C.

We note that y = f (x) has turning point (0, d). And so y² = f (x) has turning points $(0, \sqrt{d})$ and $(0, -\sqrt{d})$.

Finally, the curve y² = f (x) crosses the x-axis at the same points as the curve y = f (x), namely A, B, and C.

[Figure: graphs of y = f(x) and y² = f(x), showing A = (−a, 0), B = (b, 0), C = (c, 0) (the horizontal intercepts of both graphs) and the vertical intercept D = (0, d) of y = f(x).]

(ii) The tangents to the curve y² = f (x) at the points where it crosses the x-axis are vertical.


Answer to Exercise 283 (9740 N2014/II/1). (i) We are given that

$$\frac{dy}{dx} = \frac{dy}{dt} \div \frac{dx}{dt} = \frac{6}{6t} = \frac{1}{t} = 0.4.$$

So t = 2.5.

(ii) The writers of this question seem to assume that p ≠ 0, so that is what I’ll assume too (see footnote 95).

The tangent line at (3p², 6p) has equation $y - 6p = \dfrac{1}{p}(x - 3p^2)$. Where this line meets the y-axis, we have $y - 6p = \dfrac{1}{p}(0 - 3p^2) = -3p$ or y = 3p. So D = (0, 3p).

The mid-point of PD is $\left(\dfrac{3p^2 + 0}{2}, \dfrac{6p + 3p}{2}\right) = (1.5p^2, 4.5p)$. So the cartesian equation for the locus of the mid-point of PD is x = 1.5 (y/4.5)² = y²/13.5.

Answer to Exercise 284 (9740 N2013/I/2). Rearrange the given equation: xy − y = x² + x + 1 ⇐⇒ x² + (1 − y)x + 1 + y = 0. This is a quadratic in x. We are given that x ∈ R and this is so if and only if the discriminant of the quadratic is non-negative, that is, (1 − y)² − 4(1)(1 + y) = y² − 6y − 3 ≥ 0. This is a ∪-shaped quadratic with zeros

$$y = \frac{6 \pm \sqrt{(-6)^2 - 4(1)(-3)}}{2} = 3 \pm \sqrt{12} = 3 \pm 2\sqrt{3}.$$

So the last inequality is true if and only if $y \in \left(-\infty, 3 - 2\sqrt{3}\right] \cup \left[3 + 2\sqrt{3}, \infty\right)$.

Footnote 95: If p = 0, then P = (0, 0) and the tangent at P is vertical, so that D could be any point on the y-axis.

Answer to Exercise 285 (9740 N2013/I/3). (i) Write $\dfrac{x + 1}{2x - 1} = 0.5 + \dfrac{1.5}{2x - 1}$.

• Intercepts. The graph crosses the vertical axis at (0, −1) and the horizontal axis at (−1, 0).

• Asymptotes. As x → 0.5, y → ±∞. Hence, x = 0.5 is a vertical asymptote. As x → ±∞, y → 0.5. Hence, y = 0.5 is a horizontal asymptote.

The intersection of the two asymptotes is (0.5, 0.5), which is also the centre of the hyperbola. There are two lines of symmetry, each running through the centre, and each bisecting an angle formed by the two asymptotes.

[Figure: graph of y = (x + 1)/(2x − 1), showing the intercepts (−1, 0) and (0, −1), the asymptotes x = 0.5 and y = 0.5, the centre (0.5, 0.5), and the lines of symmetry y = x and y = 1 − x.]

(ii) $\dfrac{x + 1}{2x - 1} < 1$ ⇐⇒ “x + 1 < 2x − 1 AND 2x − 1 > 0” OR “x + 1 > 2x − 1 AND 2x − 1 < 0” ⇐⇒ “2 < x AND x > 0.5” OR “2 > x AND x < 0.5” ⇐⇒ “x > 2 OR x < 0.5”.


Answer to Exercise 286 (9740 N2013/II/1). (i) The range of g is not a subset of the domain of f. For example, 1 ∈ g(R), but 1 is not in the domain of f.

(ii) $gf(x) = g\left(\dfrac{2 + x}{1 - x}\right) = 1 - 2\,\dfrac{2 + x}{1 - x}$.

$gf(x) = 5 \iff 1 - 2\,\dfrac{2 + x}{1 - x} = 5 \iff 4 = -2\,\dfrac{2 + x}{1 - x} \iff 2x - 2 = 2 + x \iff x = 4$. So (gf)⁻¹(5) = 4.

Answer to Exercise 287 (9740 N2012/I/1). Let x, y, and z be the costs of, respectively, the under-16, 16-65, and over-65 tickets. The system of equations is 9x + 6y + 4z = $162.03 (equation 1), 7x + 5y + 3z = $128.36 (equation 2), and 10x + 4y + 5z = $158.50 (equation 3). So x = $7.65, y = $9.85, z = $8.52 (calculator).
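Where the answers say "(calculator)" for a system of three linear equations, the same result can be obtained by Cramer's rule. The sketch below solves the ticket-price system exactly with rational arithmetic; `det3` and `solve3` are hypothetical helper names introduced for this illustration, and the method assumes the coefficient determinant is nonzero.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(coeffs, rhs):
    """Solve a 3x3 linear system by Cramer's rule (nonzero determinant assumed)."""
    d = det3(coeffs)
    solution = []
    for col in range(3):
        # Replace one column of the coefficient matrix by the right-hand side.
        replaced = [row[:col] + [rhs[r]] + row[col + 1:] for r, row in enumerate(coeffs)]
        solution.append(Fraction(det3(replaced)) / d)
    return solution

coeffs = [[9, 6, 4], [7, 5, 3], [10, 4, 5]]
rhs = [Fraction("162.03"), Fraction("128.36"), Fraction("158.50")]

x, y, z = solve3(coeffs, rhs)
print(float(x), float(y), float(z))
```

The same `solve3` helper applies to any of the other three-equation systems in this chapter once each equation is written with numeric coefficients for the three unknowns.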


Answer to Exercise 288 (9740 N2012/I/7). (i) To show that g⁻¹ = g, it suffices to show that g (g(x)) = x:

$$g(g(x)) = \frac{g(x) + k}{g(x) - 1} = \frac{\dfrac{x + k}{x - 1} + k}{\dfrac{x + k}{x - 1} - 1} = \frac{x + k + k(x - 1)}{x + k - (x - 1)} = \frac{x(1 + k)}{k + 1} = x.$$

(ii) Intercepts. The graph crosses the vertical axis at (0, −k) and the horizontal axis at (−k, 0).

Asymptotes. As x → 1, y → ±∞. Hence, the graph has a vertical asymptote x = 1. Moreover, as x → ±∞, y → 1. Hence, the graph has a horizontal asymptote y = 1.

[Figure: graph of y = (x + k)/(x − 1), showing the intercepts (−k, 0) and (0, −k), the asymptotes x = 1 and y = 1, the centre (1, 1), and the lines of symmetry.]

(iii) Since g is self-inverse, a line of symmetry is y = x. Observe that $\dfrac{x + k}{x - 1} = 1 + \dfrac{k + 1}{x - 1}$. Hence, to transform $y = \dfrac{1}{x}$ into $y = \dfrac{x + k}{x - 1}$:

1. Move the graph rightwards by 1 unit to get the graph of $y = \dfrac{1}{x - 1}$.

2. Stretch it vertically by a factor of k + 1, outwards from the horizontal axis to get the graph of $y = \dfrac{k + 1}{x - 1}$.

3. Move the graph upwards by 1 unit to get the graph of $y = 1 + \dfrac{k + 1}{x - 1} = \dfrac{x + k}{x - 1}$.


Answer to Exercise 289 (9740 N2012/II/3). (i) Graph on your calculator and copy.

[Figure: graphs of y = f(x) and y = |f(x)|.]

(ii) f (x) = 4 ⇐⇒ x3 + x2 − 2x − 4 = 4 ⇐⇒ x3 + x2 − 2x − 8 = 0. By observation, x = 2 is an integer solution to this last equation.

So x3 +x2 −2x−8 = (x−2) (x2 + ax + b) = x3 +(a−2)x2 +(b−2a)x−2b. Comparing coefficients, a = 3 and b = 4.

So x³ + x² − 2x − 8 = (x − 2) (x² + 3x + 4). But x² + 3x + 4 is a quadratic whose discriminant is negative. Altogether then, 2 is the only real solution to the given equation.

(iii) The integer solution of f (x) = 4 is 2. The given equation is equivalent to f (x + 3) = 4. We know that only x + 3 = 2 solves this equation. Hence, the solution is −1.

(iv) Where f (x) < 0, reflect it in the horizontal axis. Where f (x) ≥ 0, keep it unchanged. Altogether, y = ∣f (x)∣ is graphed above.

(v) ∣f (x)∣ = 4 ⇐⇒ ∣x³ + x² − 2x − 4∣ = 4. Call this equation (1).

If x³ + x² − 2x − 4 ≥ 0 so that ∣x³ + x² − 2x − 4∣ = x³ + x² − 2x − 4, then equation (1) becomes x³ + x² − 2x − 4 = 4 ⇐⇒ x³ + x² − 2x − 8 = 0.

We already found that this cubic equation has only one real root, namely 2.

If x³ + x² − 2x − 4 < 0 so that ∣x³ + x² − 2x − 4∣ = −x³ − x² + 2x + 4, then equation (1) becomes −x³ − x² + 2x + 4 = 4 ⇐⇒ 0 = x³ + x² − 2x = x(x² + x − 2) = x(x + 2)(x − 1).

This second cubic equation has three real roots, namely 0, −2, and 1.

Altogether, the equation ∣f (x)∣ = 4 has four real roots — −2, 0, 1, and 2.
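A quick way to double-check these four roots is to evaluate ∣f(x)∣ at integer values. The sketch below only searches integers in a small window (an assumption made for this illustration), which suffices here because the algebra above shows that all four real roots are integers.

```python
def f(x):
    return x**3 + x**2 - 2*x - 4

# Integer candidates in a small, arbitrarily chosen window
candidates = range(-10, 11)
roots = [x for x in candidates if abs(f(x)) == 4]

print(roots)
```

Each root of the second cubic does satisfy f(x) < 0, and the root 2 satisfies f(x) ≥ 0, so every root belongs to the case in which it was found.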


Answer to Exercise 290 (9740 N2011/I/1). When is $\dfrac{x^2 + x + 1}{x^2 + x - 2} < 0$?

The numerator x² + x + 1 is a ∪-shaped quadratic with discriminant b² − 4ac = 1² − 4(1)(1) = −3 < 0. So it is always positive. The above inequality thus holds if and only if the denominator is negative. Let’s check when the denominator is negative.

The denominator x2 + x − 2 is a ∪-shaped quadratic with zeros −2 and 1. So x2 + x − 2 < 0 if and only if x ∈ (−2, 1). Altogether then, the inequality holds if and only if x ∈ (−2, 1).

Answer to Exercise 291 (9740 N2011/I/2). (i) The given information forms this system of equations: a (−1.5)² + b (−1.5) + c = 4.5 (equation 1), a (2.1)² + b (2.1) + c = 3.2 (equation 2), and a (3.4)² + b (3.4) + c = 4.1 (equation 3). So a ≈ 0.215, b ≈ −0.490, and c ≈ 3.281 (calculator).

(ii) f ′(x) = 2ax + b ≈ 0.430x − 0.490 > 0 if and only if $x > \dfrac{49}{43} \approx 1.140$.


Answer to Exercise 292 (9740 N2011/II/3). (i) The inverse function f⁻¹ has domain R (this is simply the range of f ) and codomain (−0.5, ∞) (this is simply the domain of f ). To find the mapping rule, write $y = f(x) = \ln(2x + 1) + 3 \iff y - 3 = \ln(2x + 1) \iff e^{y-3} = 2x + 1 \iff 0.5\left(e^{y-3} - 1\right) = x$. And so f⁻¹ has mapping rule $y \mapsto 0.5\left(e^{y-3} - 1\right)$.

(ii) For f : As x → −0.5, y → −∞, and so x = −0.5 is a vertical asymptote. The graph crosses the vertical axis at (0, 3) and the horizontal axis at $(0.5(e^{-3} - 1), 0)$. For f⁻¹: As x → −∞, y → −0.5, and so y = −0.5 is a horizontal asymptote. The graph crosses the vertical axis at $(0, 0.5(e^{-3} - 1))$ and the horizontal axis at (3, 0).

[Figure: graphs of y = f(x) and y = f⁻¹(x), showing the vertical asymptote x = −0.5 and the intercepts (0, 3) and (0.5[e⁻³ − 1], 0) of f, and the horizontal asymptote y = −0.5 and the intercepts (3, 0) and (0, 0.5[e⁻³ − 1]) of f⁻¹.]

(iii) If the graph of f intersects the line y = x, then it also intersects the graph of f⁻¹ at the points where x = f (x). In this case, x = f (x) ⇐⇒ x = ln(2x + 1) + 3 ⟹ x ≈ −0.485, 5.482 (calculator).
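The two calculator values in part (iii) can be reproduced by bisection on h(x) = ln(2x + 1) + 3 − x. This is a sketch only: the bracketing intervals were chosen by hand so that h changes sign on each, and the names `h` and `bisect` are introduced here for illustration.

```python
from math import log

def h(x):
    """h(x) = f(x) - x, where f(x) = ln(2x + 1) + 3."""
    return log(2 * x + 1) + 3 - x

def bisect(func, lo, hi, tol=1e-10):
    """Find a root of func in [lo, hi], assuming func changes sign there."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return (lo + hi) / 2

root_small = bisect(h, -0.4999, 0.0)   # h < 0 near x = -0.5, h(0) = 3 > 0
root_large = bisect(h, 0.0, 10.0)      # h(0) > 0, h(10) < 0

print(round(root_small, 3), round(root_large, 3))
```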


Answer to Exercise 293 (9740 N2010/I/5). (i) First, translate the graph 2 units rightwards to get the graph of y = (x − 2)³.

The wording of the second step is ambiguous. I shall interpret “after the stretch of scale factor 0.5 parallel to the y-axis” to mean a horizontal stretch outwards from the vertical axis. If so, then after this stretch, we have the graph of y = (2x − 2)³ (see footnote 96).

Finally, translate the graph 6 units downwards to get the graph of y = (2x − 2)³ − 6.

The graph crosses the y-axis at (0, −14) and the x-axis at $(0.5\sqrt[3]{6} + 1, 0)$.

In sketching our graph, it helps to keep in mind that the cubic equation y = x³ has a single stationary inflexion point and has no turning points. And thus, the same must be true for y = (2x − 2)³ − 6.

(ii) The graph of f⁻¹ is simply the reflection of the graph of f in the line y = x.

[Figure: graphs of y = f(x) = (2x − 2)³ − 6 and y = f⁻¹(x), showing the intercepts (0, −14) and (0.5∛6 + 1, 0) of f, and the intercepts (−14, 0) and (0, 0.5∛6 + 1) of f⁻¹.]

Footnote 96: If instead this is interpreted to mean a vertical stretch outwards from the horizontal axis, then after the stretch, we have the graph of y = 2(x − 2)³.


Answer to Exercise 294 (9740 N2010/II/4). (i) Graph on your calculator and copy. The vertical asymptotes are x = ±1.

[Figure: graph of y = f(x), showing the vertical asymptotes x = −1 and x = 1, the horizontal asymptote y = 0, and the vertical intercept (0, −1).]

(ii) By observation, the graph of f is symmetric in the vertical axis and if the domain of f is restricted to $\mathbb{R}_0^+$, the new function thus formed would be invertible. Hence, the smallest k for which f⁻¹ exists is k = 0.

(iii)

$$fg(x) = f(g(x)) = f\left(\frac{1}{x - 3}\right) = \frac{1}{\left(\frac{1}{x - 3}\right)^2 - 1} = \frac{(x - 3)^2}{1 - (x - 3)^2} = \frac{(x - 3)^2}{[1 - (x - 3)][1 + (x - 3)]} = \frac{(x - 3)^2}{(4 - x)(x - 2)}.$$

(iv) The numerator is always positive (recall that x ≠ 3).

The denominator (4 − x)(x − 2) is a ∩-shaped quadratic; so it is positive ⇐⇒ x ∈ (2, 4).

Altogether then, f g(x) > 0 ⇐⇒ x ∈ (2, 3) ∪ (3, 4) (note that 3 is excluded because it is not in the domain of f g).

(v) Method #1 (non-rigorous). Graph f g on your calculator. Observe that f g(x) is always less than −1 or more than 0. Be very careful to note that f g(x) ≠ 0 for any x, because 3 is not in the domain of f g. Hence, the range of f g is (−∞, −1) ∪ (0, ∞).

[Figure: graph of y = fg(x). The point (3, 0) is not part of the graph.]

Method #2 (rigorous). Observe that

$$\frac{(x - 3)^2}{(4 - x)(x - 2)} = \frac{x^2 - 6x + 9}{-x^2 + 6x - 8} = -1 + \frac{1}{-x^2 + 6x - 8} = -1 + \frac{1}{(4 - x)(x - 2)}.$$

Observe also that (4 − x)(x − 2) is a ∩-shaped quadratic with maximum value 1 (at x = 3). But given the restrictions that x ≠ 2, x ≠ 3, x ≠ 4, we have (4 − x)(x − 2) ∈ (−∞, 0) ∪ (0, 1). So

$$\frac{1}{(4 - x)(x - 2)} \in (-\infty, 0) \cup (1, \infty).$$

And $-1 + \dfrac{1}{(4 - x)(x - 2)} \in (-\infty, -1) \cup (0, \infty)$. The range of f g is (−∞, −1) ∪ (0, ∞).


Answer to Exercise 295 (9740 N2009/I/1). Let $an^2 + bn + c$ be the quadratic polynomial. The given information yields this system of equations: $a(1^2) + b(1) + c = 10$, $a(2^2) + b(2) + c = 6$, and $a(3^2) + b(3) + c = 5$.

So $a = 1.5$, $b = -8.5$, and $c = 17$, and the polynomial is $1.5n^2 - 8.5n + 17$ (calculator).

(ii) Either by hand or by calculator, $1.5n^2 - 8.5n + 17 = 100 \iff 1.5n^2 - 8.5n - 83 = 0 \iff 3n^2 - 17n - 166 = 0 \iff$

$$n = \frac{17 \pm \sqrt{(-17)^2 - 4(3)(-166)}}{6} = \frac{17 \pm \sqrt{289 + 1992}}{6} = \frac{17 \pm \sqrt{2281}}{6}.$$

We can discard the negative root. The positive root that remains is approximately 10.8. Bearing in mind that n must be an integer, we conclude that the set of values for which un is greater than 100 is {11, 12, 13, . . . }.
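As a quick check of both parts, here is a minimal sketch that fits the quadratic through the three given terms (10, 6, 5, as in the system above) and then searches for the first term exceeding 100. The helper names `fit_quadratic` and `u` are illustrative only.

```python
from fractions import Fraction

def fit_quadratic(u1, u2, u3):
    """Return (a, b, c) with a*n^2 + b*n + c equal to u1, u2, u3 at n = 1, 2, 3."""
    # The second difference of a quadratic sequence equals 2a.
    a = Fraction(u3 - 2 * u2 + u1, 2)
    # The first difference u2 - u1 equals 3a + b.
    b = (u2 - u1) - 3 * a
    c = u1 - a - b
    return a, b, c

a, b, c = fit_quadratic(10, 6, 5)

def u(n):
    return a * n * n + b * n + c

# Smallest positive integer n with u(n) > 100.
n = 1
while u(n) <= 100:
    n += 1
print(a, b, c, n)
```

Because the quadratic is ∪-shaped with its minimum near n = 2.8, every term after this first one also exceeds 100.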


Answer to Exercise 296 (9740 N2009/I/6). (i) $C_1$ is a rectangular hyperbola. Write $y = \frac{x-2}{x+2} = 1 - \frac{4}{x+2}$. This curve crosses the vertical axis at $(0, -1)$ and the horizontal axis at $(2, 0)$. As $x \to -2$, $y \to \pm\infty$, and so $x = -2$ is a vertical asymptote. Moreover, as $x \to \pm\infty$, $y \to 1$, and so $y = 1$ is a horizontal asymptote.

$C_2$ is an ellipse that crosses the vertical axis at $(0, -\sqrt{3})$ and $(0, \sqrt{3})$, and the horizontal axis at $(-\sqrt{6}, 0)$ and $(\sqrt{6}, 0)$. It has no asymptotes.

[Figure: graphs of $C_1$ and $C_2$, showing the asymptotes, centre, and lines of symmetry of $C_1$ and the axial intercepts of both curves.]

(ii) Square both sides of the equation for $C_1$ to get $y^2 = \frac{(x-2)^2}{(x+2)^2}$. Plug this into the equation for $C_2$:

$$\frac{x^2}{6} + \frac{1}{3}\cdot\frac{(x-2)^2}{(x+2)^2} = 1 \iff x^2 + \frac{2(x-2)^2}{(x+2)^2} = 6 \iff x^2(x+2)^2 + 2(x-2)^2 = 6(x+2)^2$$

$$\iff 2(x-2)^2 = 6(x+2)^2 - x^2(x+2)^2 = (x+2)^2(6-x^2),$$

as desired.

(iii) They are −0.5149 and 2.445 (correct to 4 s.f.).

Answer to Exercise 297 (9740 N2009/II/3). (i) This is a rectangular hyperbola with vertical asymptote $x = \frac{a}{b}$ and horizontal asymptote $y = \frac{a}{b}$. So the range of $f$ is $\left(-\infty, \frac{a}{b}\right) \cup \left(\frac{a}{b}, \infty\right)$.

Let $f^{-1}$ have domain $\left(-\infty, \frac{a}{b}\right) \cup \left(\frac{a}{b}, \infty\right)$ (this is simply the range of $f$) and codomain $\left(-\infty, \frac{a}{b}\right) \cup \left(\frac{a}{b}, \infty\right)$ (this is simply the domain of $f$). For the mapping rule of $f^{-1}$, write

$$y = f(x) = \frac{ax}{bx-a} \iff y(bx-a) = ax \iff -ay = x(a-by) \iff \frac{ay}{by-a} = x$$

(the division is permitted because $y \neq a/b$). So $f^{-1}$ has mapping rule $y \mapsto \frac{ay}{by-a}$.

Altogether then, $f^{-1} = f$. And thus $f^2(x) = ff(x) = ff^{-1}(x) = x$. The range of $f^2$ is simply $\left(-\infty, \frac{a}{b}\right) \cup \left(\frac{a}{b}, \infty\right)$.

(ii) The range of $g$ is $(-\infty, 0) \cup (0, \infty)$, while the domain of $f$ is $\left(-\infty, \frac{a}{b}\right) \cup \left(\frac{a}{b}, \infty\right)$. Since $a/b \neq 0$, the range of $g$ is not a subset of the domain of $f$ and so $fg$ does not exist.

(iii) We have $f^{-1}(x) = x \iff \frac{ax}{bx-a} = x \iff ax = x(bx-a) \iff 0 = x(bx-2a)$. Thus, $x = 0$ or $x = \frac{2a}{b}$ solves $f^{-1}(x) = x$.

Answer to Exercise 298 (9740 N2008/I/9).

(i) $f'(x) = \frac{(cx+d)a - (ax+b)c}{(cx+d)^2} = \frac{ad-bc}{(cx+d)^2}$. Since $ad - bc \neq 0$, $f'(x) \neq 0$ for any $x$ and so there are no turning points.

(ii) If $ad - bc = 0$, then $f'(x) = 0$ for all $x$. Hence, the graph is simply a horizontal line.

(iii) In this case, $\frac{dy}{dx} = \frac{3 \times 1 - (-7) \times 2}{(2x+1)^2} = \frac{17}{(2x+1)^2}$, which is always positive. Hence, the graph has a positive gradient at all points.

(iv) (a) This is a rectangular hyperbola that crosses the vertical axis at $(0, -7)$ and the horizontal axis at $\left(\frac{7}{3}, 0\right)$. Since $\frac{3x-7}{2x+1} = 1.5 - \frac{8.5}{2x+1}$, there is a vertical asymptote $x = -0.5$ and a horizontal asymptote $y = 1.5$.

(b) The graph of $y^2 = f(x)$ is symmetric in the horizontal axis. It crosses the horizontal axis at $\left(\frac{7}{3}, 0\right)$. It has vertical asymptote $x = -0.5$ and horizontal asymptotes $y = \pm\sqrt{1.5}$.

[Figure: graphs of $y = (3x-7)/(2x+1)$ and $y^2 = (3x-7)/(2x+1)$, showing the vertical asymptote $x = -0.5$, the horizontal asymptote $y = 1.5$ of the first graph, the horizontal asymptotes $y = \pm\sqrt{1.5}$ of the second, the horizontal intercept $(7/3, 0)$ of both graphs, and the vertical intercept $(0, -7)$.]

Answer to Exercise 299 (9233 N2008/I/14). (i) The curve crosses both the vertical and horizontal axes at $(0, 0)$. Since $\frac{x}{x^2-1} = \frac{x}{(x+1)(x-1)}$, the horizontal asymptote is $y = 0$ and the vertical asymptotes are $x = \pm 1$.

(ii) $y^2 = f(x)$ is symmetric in the horizontal axis and empty where $f(x) < 0$. At the origin, the tangent to the curve is vertical.

[Figure: graphs of $y = x/(x^2-1)$ and $y^2 = x/(x^2-1)$, showing the vertical asymptotes $x = \pm 1$, the horizontal asymptote $y = 0$ for both graphs, and the intercept $(0, 0)$ for both graphs.]

The intersection points of $y = \frac{x}{x^2-1}$ and $y = e^x$ are given by $\frac{x}{x^2-1} = e^x \iff x = e^x(x^2-1) \iff xe^{-x} = x^2 - 1 \iff 1 + xe^{-x} = x^2$, as desired.

Try the starting value $x_0 = 2$. Then

$x_1 = \sqrt{1 + x_0e^{-x_0}} = \sqrt{1 + 2e^{-2}} \approx 1.12724$.

$x_2 = \sqrt{1 + x_1e^{-x_1}} = \sqrt{1 + 1.12724e^{-1.12724}} \approx 1.16839$.

$x_3 = \sqrt{1 + x_2e^{-x_2}} = \sqrt{1 + 1.16839e^{-1.16839}} \approx 1.16757$.

$x_4 = \sqrt{1 + x_3e^{-x_3}} = \sqrt{1 + 1.16757e^{-1.16757}} \approx 1.16759$.

So the positive root is 1.17 (correct to 2 decimal places).
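The iteration above is easy to reproduce by machine. The following is a minimal sketch, with `iterate` as an illustrative helper name, that applies the rule $x_{n+1} = \sqrt{1 + x_n e^{-x_n}}$ starting from $x_0 = 2$ and prints the first few iterates.

```python
import math

def iterate(x0, steps):
    """Apply x -> sqrt(1 + x*exp(-x)) repeatedly, returning all iterates."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(math.sqrt(1 + x * math.exp(-x)))
    return xs

xs = iterate(2.0, 10)
for i, x in enumerate(xs[:5]):
    print(f"x{i} = {x:.5f}")
```

The iterates settle quickly because the slope of the iteration function near the root is small in magnitude, so each step shrinks the error substantially.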


Answer to Exercise 300 (9740 N2008/II/4). (i) $y = (x-4)^2 + 1$ is a ∪-shaped quadratic that does not touch the horizontal axis. Be very careful to note that $f$ has domain $(4, \infty)$ and range $(1, \infty)$. And so the graph of $y = (x-4)^2 + 1$ excludes the endpoint $(4, 1)$.

[Figure: graphs of $y = f(x)$ and $y = f^{-1}(x)$, together with the line $y = x$. The point $(4, 1)$ is not part of the graph of $y = f(x)$, and the point $(1, 4)$ is not part of the graph of $y = f^{-1}(x)$.]

(ii) The inverse function $f^{-1}$ has domain $(1, \infty)$ (this is simply the range of $f$) and codomain $(4, \infty)$ (this is simply the domain of $f$). For the mapping rule, write $y = f(x) = (x-4)^2 + 1 \iff y - 1 = (x-4)^2 \iff \pm\sqrt{y-1} = x - 4 \iff x = 4 \pm \sqrt{y-1}$. We know that $x > 4$. Hence, $f^{-1}$ has mapping rule $y \mapsto 4 + \sqrt{y-1}$.

(iii) See above.

(iv) Reflect the graph of $f$ in the line $y = x$ to get the graph of $f^{-1}$.

The solution to $f(x) = f^{-1}(x)$ is given by $f(x) = x$, or $(x-4)^2 + 1 = x \iff x^2 - 9x + 17 = 0 \iff x = \frac{9 \pm \sqrt{9^2 - 4(17)}}{2} = \frac{9 \pm \sqrt{13}}{2}$. We can reject the smaller root because it is less than 4. Hence, the solution is $\frac{9 + \sqrt{13}}{2}$.

Answer to Exercise 301 (9740 N2007/I/1).

$$\frac{2x^2-x-19}{x^2+3x+2} - 1 = \frac{2x^2-x-19-(x^2+3x+2)}{x^2+3x+2} = \frac{x^2-4x-21}{x^2+3x+2}, \text{ as desired.}$$

And so $\frac{2x^2-x-19}{x^2+3x+2} > 1 \iff \frac{x^2-4x-21}{x^2+3x+2} > 0$.

The numerator $x^2 - 4x - 21$ is a ∪-shaped quadratic with zeros $-3$, $7$. So $x^2 - 4x - 21 > 0 \iff$ “$x < -3$ or $x > 7$”. Also, $x^2 - 4x - 21 < 0 \iff$ “$x \in (-3, 7)$”.

The denominator $x^2 + 3x + 2$ is a ∪-shaped quadratic with zeros $-2$, $-1$. So $x^2 + 3x + 2 > 0 \iff$ “$x < -2$ or $x > -1$”. Also, $x^2 + 3x + 2 < 0 \iff$ “$x \in (-2, -1)$”.

Altogether then, $\frac{x^2-4x-21}{x^2+3x+2} > 0 \iff$ one of the following is true:

1. “$x < -3$ OR $x > 7$” AND “$x < -2$ OR $x > -1$” $\iff$ “$x < -3$ or $x > 7$”; OR
2. “$x \in (-3, 7)$ AND $x \in (-2, -1)$” $\iff$ “$x \in (-2, -1)$”.

Altogether then, the given inequality holds if and only if $x \in (-\infty, -3) \cup (-2, -1) \cup (7, \infty)$.

Answer to Exercise 302 (9740 N2007/I/2). (i) $f$ has domain $(-\infty, 3) \cup (3, \infty)$ and range $(-\infty, 0) \cup (0, \infty)$. $g$ has domain $\mathbb{R}$ and range $[0, \infty)$.

Hence, the range of $f$ is a subset of the domain of $g$. And so the composite function $gf$ exists. It has the same domain as $f$, namely $(-\infty, 3) \cup (3, \infty)$, and the same codomain as $g$, namely $\mathbb{R}$. Its mapping rule is $gf : x \mapsto g(f(x)) = g\left(\frac{1}{x-3}\right) = \frac{1}{(x-3)^2}$.

In contrast, the range of $g$ is not a subset of the domain of $f$. And so the composite function $fg$ does not exist.

(ii) To find the mapping rule for the inverse function $f^{-1}$, write: $y = f(x) = \frac{1}{x-3} \iff \frac{1}{y} = x - 3$ (division is permitted ∵ $y \neq 0$) $\iff \frac{1}{y} + 3 = x$.

Hence, the inverse function $f^{-1}$ has domain $(-\infty, 0) \cup (0, \infty)$ (this is simply the range of $f$), codomain $(-\infty, 3) \cup (3, \infty)$ (this is simply the domain of $f$), and mapping rule $y \mapsto \frac{1}{y} + 3$.

Answer to Exercise 303 (9740 N2007/I/5). Write $y = \frac{2x+7}{x+2} = 2 + \frac{3}{x+2}$. The sequence of transformations is:

1. Move the graph of $y = 1/x$ to the left by 2 units to get the graph of $y = \frac{1}{x+2}$.
2. Stretch it vertically by a factor of 3, outwards from the horizontal axis, to get the graph of $y = \frac{3}{x+2}$.
3. Move it up by 2 units to get the graph of $y = 2 + \frac{3}{x+2}$.

• Intercepts. The graph intersects the vertical axis at $(0, 3.5)$ and the horizontal axis at $(-3.5, 0)$.
• Asymptotes. As $x \to -2$, $y \to \pm\infty$ and so $x = -2$ is a vertical asymptote. Also, as $x \to \pm\infty$, $y \to 2$ and so $y = 2$ is a horizontal asymptote.

[Figure: graph of $y = (2x+7)/(x+2)$, with centre $(-2, 2)$, vertical asymptote $x = -2$, horizontal asymptote $y = 2$, lines of symmetry $y = x + 4$ and $y = -x$, vertical intercept $(0, 3.5)$, and horizontal intercept $(-3.5, 0)$.]

Answer to Exercise 304 (9740 N2007/II/1). Let $x$, $y$, and $z$ be the prices per kilogram of, respectively, the pineapples, mangoes, and lychees. The system of equations is $1.15x + 0.6y + 0.55z = 8.28$, $1.2x + 0.45y + 0.3z = 6.84$, and $2.15x + 0.9y + 0.65z = 13.05$. So $x = 3.50$, $y = 2.60$, $z = 4.90$, and the total amount paid by Lee Lian was 7.65 dollars (calculator).
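For readers who would like to see what the calculator is doing, here is a minimal sketch that solves the same three equations exactly using Cramer's rule. The helper names `det3` and `solve3` are illustrative only; a graphing calculator uses its own built-in solver.

```python
from fractions import Fraction as F

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(A, rhs):
    """Solve a 3x3 linear system by Cramer's rule (assumes a non-zero determinant)."""
    d = det3(A)
    sols = []
    for col in range(3):
        # Replace one column of A by the right-hand side.
        m = [[rhs[r] if k == col else A[r][k] for k in range(3)] for r in range(3)]
        sols.append(det3(m) / d)
    return sols

A = [[F("1.15"), F("0.6"), F("0.55")],
     [F("1.2"), F("0.45"), F("0.3")],
     [F("2.15"), F("0.9"), F("0.65")]]
rhs = [F("8.28"), F("6.84"), F("13.05")]
x, y, z = solve3(A, rhs)
print(float(x), float(y), float(z))
```

Exact fractions are used so that the prices come out without floating-point rounding.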

Answer to Exercise 305 (9233 N2007/II/4). (i) Write $y = \frac{4x+1}{x-3} = 4 + \frac{13}{x-3}$. So $x = 3$ is a vertical asymptote and $y = 4$ is a horizontal asymptote.

(ii) The graph intersects the vertical axis at $\left(0, -\frac{1}{3}\right)$ and the horizontal axis at $\left(-\frac{1}{4}, 0\right)$.

[Figure: graph of $y = (4x+1)/(x-3)$, with centre $(3, 4)$, vertical asymptote $x = 3$, horizontal asymptote $y = 4$, lines of symmetry $y = x + 1$ and $y = -x + 7$, vertical intercept $(0, -1/3)$, and horizontal intercept $(-1/4, 0)$.]

(iii) The range of $f$ is $(-\infty, 4) \cup (4, \infty)$, so this is also the domain of $f^{-1}$. Write $y = f(x) = 4 + \frac{13}{x-3} \iff y - 4 = \frac{13}{x-3} \iff \frac{13}{y-4} = x - 3$ (the division is permitted ∵ $y \neq 4$) $\iff 3 + \frac{13}{y-4} = x$. So $f^{-1}(y) = 3 + \frac{13}{y-4}$.

Answer to Exercise 306 (9233 N2006/I/3). The domain of $f$ is $(0, \infty)$ and that of $g$ is $(0, \infty)$. The range of $g$ is $(0, \infty)$. The range of $g$ is a subset of the domain of $f$ and so the composite function $fg$ exists. The range of $g$ is a subset of the domain of $g$ and so the composite functions $g^2$ and $g^{35}$ also exist.

The composite function $fg$ has domain $(0, \infty)$, codomain $(0, \infty)$, and mapping rule $fg : x \mapsto f(g(x)) = f\left(\frac{3}{x}\right) = \frac{15}{x} + 3$.

The composite function $g^2$ has domain $(0, \infty)$, codomain $(0, \infty)$, and mapping rule $g^2 : x \mapsto g(g(x)) = g\left(\frac{3}{x}\right) = 3 / \left(\frac{3}{x}\right) = x$.

The composite function $g^4$ has domain $(0, \infty)$, codomain $(0, \infty)$, and mapping rule $g^4 : x \mapsto g^2(g^2(x)) = g^2(x) = x$.

The composite function $g^6$ has domain $(0, \infty)$, codomain $(0, \infty)$, and mapping rule $g^6 : x \mapsto g^2(g^4(x)) = g^2(x) = x$.

⋮

The composite function $g^{34}$ has domain $(0, \infty)$, codomain $(0, \infty)$, and mapping rule $g^{34} : x \mapsto g^2(g^{32}(x)) = g^2(x) = x$.

The composite function $g^{35}$ has domain $(0, \infty)$, codomain $(0, \infty)$, and mapping rule $g^{35} : x \mapsto g(g^{34}(x)) = g(x) = 3/x$.

$h(x) = 5f(x) + 3$.

Answer to Exercise 307 (9233 N2006/II/1). As usual, rearrange the inequality into what I call standard form ($\frac{N}{D} > 0$ or $\frac{N}{D} \geq 0$):

$$\frac{x-9}{x^2-9} \leq 1 \iff \frac{x-9-x^2+9}{x^2-9} \leq 0 \iff \frac{x-x^2}{x^2-9} \leq 0 \iff \frac{x(x-1)}{(x-3)(x+3)} \geq 0.$$

The numerator $x(x-1)$ is a ∪-shaped quadratic with zeros 0 and 1. Hence, $x(x-1) \geq 0 \iff$ “$x \leq 0$ OR $x \geq 1$”. Also, $x(x-1) \leq 0 \iff$ “$x \in [0, 1]$”.

The denominator $(x-3)(x+3)$ is a ∪-shaped quadratic with zeros $-3$ and 3. Hence, $(x-3)(x+3) > 0 \iff$ “$x < -3$ OR $x > 3$”. Also, $(x-3)(x+3) < 0 \iff$ “$x \in (-3, 3)$”.

So $\frac{x(x-1)}{(x-3)(x+3)} \geq 0 \iff$ one of the following is true:

1. “$x \leq 0$ OR $x \geq 1$” AND “$x < -3$ OR $x > 3$” $\iff$ “$x < -3$ or $x > 3$”; OR
2. “$x \in [0, 1]$ AND $x \in (-3, 3)$” $\iff$ “$x \in [0, 1]$”.

Altogether then, the inequality holds when $x \in (-\infty, -3) \cup [0, 1] \cup (3, \infty)$.

92.2 Answers for Ch. 75: Sequences and Series

Answer to Exercise 308 (9740 N2015/I/8). (i) Athlete A will take $(2T + 49 \times 2) \times \frac{50}{2} = 50T + 49 \times 50 = 50T + 2450$ seconds to complete. The required time interval (in seconds) is $[5400, 6300]$. So we need $50T + 2450 \in [5400, 6300]$. Or equivalently, we need $T \in [59, 77]$.

(ii) Athlete B will take $t\,\frac{1 - 1.02^{50}}{1 - 1.02} = 50t(1.02^{50} - 1)$ seconds to complete. The required time interval (in seconds) is $[5400, 6300]$. So we need $50t(1.02^{50} - 1) \in [5400, 6300]$. Or equivalently, we need $t \in [63.845, 74.486]$.

(iii) $T = 59$ and $t = 63.845$. So Athlete A completes the last lap in $T + 49 \times 2 = 157$ s, while Athlete B completes it in $t \times 1.02^{49} \approx 168$ s. And so the difference is 11 s.
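The three parts above can be checked by summing the lap times directly. This is a minimal sketch under the model used in the answer (Athlete A's laps increase by 2 s each, Athlete B's by 2% each); the names `time_a` and `time_b` are illustrative.

```python
def time_a(T, laps=50):
    """Total time for Athlete A: lap times T, T+2, T+4, ... (arithmetic)."""
    return sum(T + 2 * k for k in range(laps))

def time_b(t, laps=50):
    """Total time for Athlete B: lap times t, 1.02t, 1.02^2 t, ... (geometric)."""
    return sum(t * 1.02 ** k for k in range(laps))

T_min, T_max = (5400 - 2450) / 50, (6300 - 2450) / 50
factor = time_b(1.0)            # equals 50 * (1.02**50 - 1)
t_min, t_max = 5400 / factor, 6300 / factor

# Last-lap times when both athletes finish in exactly 5400 s.
last_a = T_min + 49 * 2
last_b = t_min * 1.02 ** 49
print(T_min, T_max, round(t_min, 3), round(t_max, 3), round(last_b - last_a))
```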


Answer to Exercise 309 (9740 N2015/II/4). (a) Step #1. Let P(k) stand for the proposition that

$$\sum_{r=1}^{k} r(r+2)(r+5) = \frac{1}{12}k(k+1)(3k^2+31k+74).$$

Our goal is to show that P(k) is true for all k = 1, 2, 3, . . .

Step #2. Verify that P(1) is true.

$$1 \times 3 \times 6 = 18 = \frac{1}{12}\,1(1+1)(3 \times 1^2 + 31 \times 1 + 74). \quad ✓$$

Step #3. Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ). Assume that P(j) is true. That is,

$$\sum_{r=1}^{j} r(r+2)(r+5) = \frac{1}{12}j(j+1)(3j^2+31j+74).$$

Our goal is to show that P(j + 1) is true. That is,

$$\sum_{r=1}^{j+1} r(r+2)(r+5) = \frac{1}{12}(j+1)[(j+1)+1][3(j+1)^2+31(j+1)+74].$$

To this end, write

$$\begin{aligned}
\sum_{r=1}^{j+1} r(r+2)(r+5) &= \sum_{r=1}^{j} r(r+2)(r+5) + (j+1)[(j+1)+2][(j+1)+5] \\
&= \frac{1}{12}j(j+1)(3j^2+31j+74) + (j+1)[(j+1)+2][(j+1)+5] \\
&= \frac{1}{12}j(j+1)(3j^2+31j+74) + (j+1)(j+3)(j+6) \\
&= \frac{j+1}{12}\left[j(3j^2+31j+74) + 12(j+3)(j+6)\right] \\
&= \frac{j+1}{12}\left(3j^3+31j^2+74j+12j^2+108j+216\right) \\
&= \frac{j+1}{12}\left(3j^3+43j^2+182j+216\right) \\
&= \frac{1}{12}(j+1)(j+2)(3j^2+37j+108) \\
&= \frac{1}{12}(j+1)(j+2)(3j^2+6j+3+31j+31+74) \\
&= \frac{1}{12}(j+1)[(j+1)+1][3(j+1)^2+31(j+1)+74],
\end{aligned}$$

as desired.

(b) (i) Write

$$\frac{A}{2r+1} + \frac{B}{2r+3} = \frac{(2r+3)A + (2r+1)B}{(2r+1)(2r+3)} = \frac{(2A+2B)r + 3A + B}{4r^2+8r+3}.$$

So $2A + 2B = 0 \iff A = -B$ and $3A + B = -2B = 2 \iff B = -1$ and $A = 1$. Thus,

$$\frac{2}{4r^2+8r+3} = \frac{1}{2r+1} - \frac{1}{2r+3},$$

as desired.

(ii)

$$\begin{aligned}
\sum_{r=1}^{n} \frac{2}{4r^2+8r+3} &= \sum_{r=1}^{n}\left(\frac{1}{2r+1} - \frac{1}{2r+3}\right) \\
&= \left(\frac{1}{3} - \frac{1}{5}\right) + \left(\frac{1}{5} - \frac{1}{7}\right) + \left(\frac{1}{7} - \frac{1}{9}\right) + \dots + \left(\frac{1}{2n+1} - \frac{1}{2n+3}\right) \\
&= \frac{1}{3} - \frac{1}{2n+3}.
\end{aligned}$$

(iii) $\frac{1}{2n+3} \leq 0.001 \iff 1000 \leq 2n + 3 \iff n \geq 498.5$. So the smallest $n$ is 499.

Answer to Exercise 310 (9740 N2014/I/6). (i) Step #1. Let P(k) stand for the proposition that $p_k = \frac{1}{3}(7 - 4^k)$. Our goal is to show that P(k) is true for all k = 1, 2, 3, . . .

Step #2. Verify that P(1) is true: $p_1 = 1 = \frac{1}{3}(7 - 4^1)$. ✓

Step #3. Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ).

Assume that P(j) is true. That is, $p_j = \frac{1}{3}(7 - 4^j)$.

Our goal is to show that P(j + 1) is true. That is, $p_{j+1} = \frac{1}{3}(7 - 4^{j+1})$. To this end, write

$$p_{j+1} = 4p_j - 7 = 4\left[\frac{1}{3}(7 - 4^j)\right] - 7 = \frac{28}{3} - 7 - \frac{1}{3}\left(4 \times 4^j\right) = \frac{1}{3}(7 - 4^{j+1}),$$

as desired.

(ii)

$$\begin{aligned}
\sum_{r=1}^{n} p_r &= \sum_{r=1}^{n} \frac{1}{3}(7 - 4^r) = \frac{1}{3}\sum_{r=1}^{n}(7 - 4^r) = \frac{1}{3}\left(\sum_{r=1}^{n} 7 - \sum_{r=1}^{n} 4^r\right) \\
&= \frac{1}{3}\left[7n - \frac{4(1-4^n)}{1-4}\right] = \frac{1}{3}\left[7n + \frac{4(1-4^n)}{3}\right] = \frac{4}{9} + \frac{7n}{3} - \frac{4^{n+1}}{9}.
\end{aligned}$$

(b) (i) As $n \to \infty$, $\frac{1}{(n+1)!} \to 0$ and so $S_n \to 1$.

(ii)

$$u_n = S_n - S_{n-1} = 1 - \frac{1}{(n+1)!} - \left(1 - \frac{1}{n!}\right) = \frac{1}{n!} - \frac{1}{(n+1)!} = \frac{1}{(n+1)!}[n + 1 - 1] = \frac{n}{(n+1)!}.$$

Answer to Exercise 311 (9740 N2014/II/3). (i) (a) $(4+4) + 2 \times (4+4) + 3 \times (4+4) + \dots + 10 \times (4+4) = 88 \times \frac{10}{2} = 440$ metres.

(b) After $n$ stages, he has run $(8 + 8n) \times \frac{n}{2}$ metres. Set $(8 + 8n) \times \frac{n}{2} = 5000$ and solve: $4n^2 + 4n - 5000 = 0 \iff n^2 + n - 1250 = 0 \iff$

$$n = \frac{-1 \pm \sqrt{1^2 - 4(1)(-1250)}}{2} = -0.5 \pm 0.5\sqrt{5001}.$$

The negative root can be ignored. The positive root is $-0.5 + 0.5\sqrt{5001} \approx 34.859$. Hence, the athlete needs to complete at least 35 stages.

(ii) After completing $n$ stages, he has run

$$8 + 2 \times 8 + 2 \times 16 + 2 \times 32 + \dots = 8(1 + 2 + 4 + 8 + \dots) = 8\sum_{k=1}^{n} 2^{k-1} = 8 \times \frac{1 - 2^n}{1 - 2} = 2^{n+3} - 8 \text{ metres}.$$

Set $2^{n+3} - 8 = 10000$ and solve: $n = \frac{\ln 10008}{\ln 2} - 3 \approx 10.288$. So at the instant when he has run exactly 10 km, he is in the midst of running his 11th stage. We know that after completing 10 stages, he has run $2^{13} - 8 = 8184$ m. So at the current instant, he has completed 1816 m of the 11th stage.

The 11th stage is $8 \times 2^{10} = 8192$ m long, which means he has not yet completed even half of the 11th stage. So he is now 1816 m from O, running away from O.
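The stage-by-stage totals can also be tallied directly. The sketch below assumes, as the working above does, that stage $k$ covers $8k$ metres in part (i) and $8 \times 2^{k-1}$ metres in part (ii), and that each stage is an out-and-back run from O; the function name is illustrative.

```python
def total_after_stages(n, doubling):
    """Distance (metres) run after n complete stages."""
    if doubling:
        return sum(8 * 2 ** (k - 1) for k in range(1, n + 1))
    return sum(8 * k for k in range(1, n + 1))

# Part (i)(b): fewest complete stages needed to cover at least 5 km.
n = 1
while total_after_stages(n, doubling=False) < 5000:
    n += 1

# Part (ii): locate the athlete at the instant he has run exactly 10 km.
m = 0
while total_after_stages(m + 1, doubling=True) <= 10000:
    m += 1
into_stage = 10000 - total_after_stages(m, doubling=True)
stage_length = 8 * 2 ** m                  # length of stage m + 1
outbound = into_stage < stage_length / 2   # still on the outward half?
print(n, m + 1, into_stage, outbound)
```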


Answer to Exercise 312 (9740 N2013/I/7). (i) The 1st piece is 128 cm. The 2nd is $2/3 \times 128$ cm. ... The $n$th is $p = (2/3)^{n-1} \times 128$ cm. Taking ln of both sides of this last equation, we have:

$$\begin{aligned}
\ln p &= \ln\left[(2/3)^{n-1} \times 128\right] = \ln(2/3)^{n-1} + \ln 128 = (n-1)\ln(2/3) + \ln 2^7 \\
&= (n-1)[\ln 2 - \ln 3] + 7\ln 2 = (n+6)\ln 2 + (-n+1)\ln 3.
\end{aligned}$$

(ii) The total length of string cut off approaches (i.e. keeps getting closer to, but never reaches) 384 cm:

$$128 + \left(\frac{2}{3}\right)^1 \times 128 + \left(\frac{2}{3}\right)^2 \times 128 + \left(\frac{2}{3}\right)^3 \times 128 + \dots \to \frac{128}{1 - 2/3} = 384.$$

(iii) If $n$ pieces are cut off, the total length cut off is

$$128 + \left(\frac{2}{3}\right)^1 \times 128 + \left(\frac{2}{3}\right)^2 \times 128 + \dots + \left(\frac{2}{3}\right)^{n-1} \times 128 = \frac{128\left[1 - (2/3)^n\right]}{1 - 2/3}.$$

We want to find $n$ such that this last expression equals 380:

$$\begin{aligned}
\frac{128\left[1 - (2/3)^n\right]}{1 - 2/3} = 380 &\iff 384\left[1 - (2/3)^n\right] = 380 \iff 1 - (2/3)^n = \frac{95}{96} \\
&\iff 1 - \frac{95}{96} = (2/3)^n \iff \frac{1}{96} = (2/3)^n \iff \ln\frac{1}{96} = n\ln(2/3) \\
&\iff \frac{\ln(1/96)}{\ln(2/3)} = n \implies n \approx 11.257.
\end{aligned}$$

So 12 pieces must be cut off before the total length cut off is greater than 380 cm.
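As a sanity check on parts (ii) and (iii), the following minimal sketch sums the piece lengths term by term and compares against the logarithm calculation. The names `FIRST`, `RATIO`, and `cut_total` are illustrative.

```python
import math

FIRST, RATIO = 128.0, 2 / 3

def cut_total(n):
    """Total length (cm) of the first n pieces, summed term by term."""
    return sum(FIRST * RATIO ** k for k in range(n))

limit = FIRST / (1 - RATIO)                      # sum to infinity
n_exact = math.log(1 / 96) / math.log(RATIO)     # solves cut_total(n) = 380
n_needed = math.ceil(n_exact)
print(round(limit), round(n_exact, 3), n_needed)
```

Eleven pieces fall just short of 380 cm and twelve pieces exceed it, which matches rounding 11.257 up.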


Answer to Exercise 313 (9740 N2013/I/9). (i) Step #1. Let P(k) stand for the proposition that

$$\sum_{r=1}^{k} r(2r^2+1) = \frac{1}{2}k(k+1)(k^2+k+1).$$

Our goal is to show that P(k) is true for all k = 1, 2, 3, . . .

Step #2. Verify that P(1) is true.

$$\sum_{r=1}^{1} r(2r^2+1) = 1(2 \times 1^2 + 1) = 3 = \frac{1}{2} \times 1 \times (1+1)(1^2+1+1). \quad ✓$$

Step #3. Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ). Assume that P(j) is true. That is,

$$\sum_{r=1}^{j} r(2r^2+1) = \frac{1}{2}j(j+1)(j^2+j+1).$$

Our goal is to show that P(j + 1) is true. That is,

$$\sum_{r=1}^{j+1} r(2r^2+1) = \frac{1}{2}(j+1)[(j+1)+1][(j+1)^2+(j+1)+1].$$

To this end, write

$$\begin{aligned}
\sum_{r=1}^{j+1} r(2r^2+1) &= \sum_{r=1}^{j} r(2r^2+1) + (j+1)\left[2(j+1)^2+1\right] \\
&= \frac{1}{2}j(j+1)(j^2+j+1) + (j+1)\left[2(j+1)^2+1\right] \\
&= \frac{1}{2}(j+1)\left[j(j^2+j+1) + 4(j+1)^2 + 2\right] \\
&= \frac{1}{2}(j+1)(j^3+5j^2+9j+6) \\
&= \frac{1}{2}(j+1)(j+2)(j^2+3j+3) \\
&= \frac{1}{2}(j+1)[(j+1)+1][(j+1)^2+(j+1)+1],
\end{aligned}$$

as desired.

(ii)

$$\begin{aligned}
f(r) - f(r-1) &= (2r^3+3r^2+r+24) - \left[2(r-1)^3 + 3(r-1)^2 + (r-1) + 24\right] \\
&= (2r^3+3r^2+r+24) - \left[2(r^3-3r^2+3r-1) + 3(r^2-2r+1) + (r-1) + 24\right] \\
&= -\left[2(-3r^2+3r-1) + 3(-2r+1) + (-1)\right] \\
&= 2(3r^2-3r+1) + 3(2r-1) + 1 \\
&= 2(3r^2+1) + 3(-1) + 1 = 2(3r^2+1) - 2 = 6r^2.
\end{aligned}$$

Next,

$$\begin{aligned}
\sum_{r=1}^{n} r^2 &= 1^2 + 2^2 + 3^2 + \dots + n^2 \\
&= \frac{f(1)-f(0)}{6} + \frac{f(2)-f(1)}{6} + \frac{f(3)-f(2)}{6} + \dots + \frac{f(n)-f(n-1)}{6} \\
&= \frac{f(n)-f(0)}{6} = \frac{2n^3+3n^2+n+24-24}{6} = \frac{2n^3+3n^2+n}{6} = \frac{n(2n+1)(n+1)}{6}.
\end{aligned}$$

(iii) $f(r) = 2r^3 + 3r^2 + r + 24 = r(2r^2+1) + 3r^2 + 24$. Hence,

$$\begin{aligned}
\sum_{r=1}^{n} f(r) &= \sum_{r=1}^{n}\left[r(2r^2+1) + 3r^2 + 24\right] = \sum_{r=1}^{n} r(2r^2+1) + \sum_{r=1}^{n} 3r^2 + \sum_{r=1}^{n} 24 \\
&= \sum_{r=1}^{n} r(2r^2+1) + 3\sum_{r=1}^{n} r^2 + 24n \\
&= \frac{1}{2}n(n+1)(n^2+n+1) + 3\,\frac{n(2n+1)(n+1)}{6} + 24n,
\end{aligned}$$

left unsimplified, as instructed.

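Answer to Exercise 314 (9740 N2012/I/3). (i) $u_2 = \frac{3(2)-1}{6} = \frac{5}{6}$ and $u_3 = \frac{3(5/6)-1}{6} = \frac{1}{4}$.

(ii) As $n \to \infty$, $u_{n+1} - u_n \to 0 \iff \frac{3u_n - 1}{6} - u_n = \frac{-3u_n - 1}{6} \to 0 \iff u_n \to -\frac{1}{3}$.

(iii) Step #1. Let P(k) stand for the proposition that

$$u_k = \frac{14}{3}\left(\frac{1}{2}\right)^k - \frac{1}{3}.$$

Our goal is to show that P(k) is true for all k = 1, 2, 3, . . .

Step #2. Verify that P(1) is true.

$$\frac{14}{3}\left(\frac{1}{2}\right)^1 - \frac{1}{3} = \frac{7}{3} - \frac{1}{3} = 2. \quad ✓$$

Step #3. Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ). Assume that P(j) is true. That is,

$$u_j = \frac{14}{3}\left(\frac{1}{2}\right)^j - \frac{1}{3}.$$

Our goal is to show that P(j + 1) is true. That is,

$$u_{j+1} = \frac{14}{3}\left(\frac{1}{2}\right)^{j+1} - \frac{1}{3}.$$

To this end, write:

$$\begin{aligned}
u_{j+1} &= \frac{3u_j - 1}{6} = \frac{3\left[\frac{14}{3}\left(\frac{1}{2}\right)^j - \frac{1}{3}\right] - 1}{6} = \frac{14\left(\frac{1}{2}\right)^j - 1 - 1}{6} = \frac{14\left(\frac{1}{2}\right)^j - 2}{6} \\
&= \frac{14}{3}\left(\frac{1}{2}\right)\left(\frac{1}{2}\right)^j - \frac{1}{3} = \frac{14}{3}\left(\frac{1}{2}\right)^{j+1} - \frac{1}{3},
\end{aligned}$$

as desired.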
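An induction proof can be spot-checked by comparing the recurrence with the closed form for the first few terms. The sketch below does this in exact arithmetic, taking $u_1 = 2$ and $u_{n+1} = (3u_n - 1)/6$ as used in the working above; the function names are illustrative.

```python
from fractions import Fraction

def u_recursive(n):
    """u_1 = 2 and u_{k+1} = (3*u_k - 1) / 6, computed exactly."""
    u = Fraction(2)
    for _ in range(n - 1):
        u = (3 * u - 1) / 6
    return u

def u_closed(n):
    """Closed form proved by induction: (14/3)(1/2)^n - 1/3."""
    return Fraction(14, 3) * Fraction(1, 2) ** n - Fraction(1, 3)

for n in range(1, 6):
    print(n, u_recursive(n), u_closed(n))
```

Such a check is no substitute for the proof, but agreement on many terms makes an algebra slip in the inductive step much less likely. The closed form also makes the limit $-\frac{1}{3}$ from part (ii) visible, since $(1/2)^n \to 0$.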


Answer to Exercise 315 (9740 N2012/II/4). (i) Let January 2001 be the 1st month. Then on the $n$th month, Mrs A’s account has

$$100 + (100 + 1 \times 10) + (100 + 2 \times 10) + \dots + [100 + (n-1) \times 10] = [200 + (n-1) \times 10] \times \frac{n}{2} = (190 + 10n) \times \frac{n}{2} = 5n^2 + 95n.$$

Set $5n^2 + 95n = 5000$ and solve:

$$n = \frac{-95 \pm \sqrt{95^2 - 4(5)(-5000)}}{10} = -9.5 \pm 33.019.$$

We can ignore the negative root. We have $-9.5 + 33.019 = 23.519$. So Mrs A’s account first became greater than \$5000 on the 24th month, that is, on December 1 2002.

(ii) On the last day of the 1st month, Mr B’s account has $100 \times 1.005$ dollars. On the last day of the 2nd month, it has $[100 \times 1.005 + 100] \times 1.005 = 100 \times 1.005^2 + 100 \times 1.005$ dollars. Etc. So on the last day of the $n$th month, Mr B’s account has

$$100 \times 1.005^n + 100 \times 1.005^{n-1} + \dots + 100 \times 1.005 = 100 \times 1.005 \times \frac{1 - 1.005^n}{1 - 1.005} = 20100 \times (1.005^n - 1).$$

Set $20100 \times (1.005^n - 1) = 5000$ and solve:

$$20100(1.005^n - 1) = 5000 \iff 1.005^n - 1 = \frac{50}{201} \iff 1.005^n = \frac{251}{201} \iff n\ln 1.005 = \ln\frac{251}{201} \iff n = \ln\frac{251}{201} \div \ln 1.005 \implies n \approx 44.541.$$

So Mr B’s account first became greater than \$5000 on the 45th month, that is, in September 2004.

(iii) Let $100r$ be the monthly percentage interest rate. Then on the second day of the 36th month, Mr B’s account has (note that the initial \$100 has only earned interest 35 times, while the last \$100 deposited hasn’t earned any interest)

$$100(1+r)^{35} + 100(1+r)^{34} + \dots + 100(1+r)^1 + 100 = 100\,\frac{1 - (1+r)^{36}}{1 - (1+r)} = 100\,\frac{(1+r)^{36} - 1}{r}.$$

Set $100 \times \frac{(1+r)^{36} - 1}{r} = 5000$ and solve:

$$100 \times \frac{(1+r)^{36} - 1}{r} = 5000 \iff \frac{(1+r)^{36} - 1}{r} = 50.$$

$r \approx 0.01796$ (calculator). And so the required monthly interest rate is ≈ 1.796%.
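The equation in part (iii) has no closed-form solution, which is why a calculator is needed. One way such a root can be found numerically is bisection. This is a minimal sketch, with `balance` and `solve_rate` as illustrative names and the search interval [0, 0.1] as an assumption; it is not a description of what any particular calculator does internally.

```python
def balance(r, months=36, deposit=100.0):
    """Balance just after the 36th deposit: deposits have earned 35, 34, ..., 0 months of interest."""
    return sum(deposit * (1 + r) ** k for k in range(months))

def solve_rate(target=5000.0, lo=0.0, hi=0.1, tol=1e-10):
    """Bisection on r; balance(r) increases with r, so the root in [lo, hi] is unique."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if balance(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

r = solve_rate()
print(round(100 * r, 3))
```

Bisection works here because the balance is 3600 dollars at r = 0 and far above 5000 dollars at r = 0.1, so the target is bracketed.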


Answer to Exercise 316 (9740 N2011/I/6).

(i) $\sin(r+0.5)\theta = \sin r\theta\cos(0.5\theta) + \cos(r\theta)\sin(0.5\theta)$ and $\sin(r-0.5)\theta = \sin r\theta\cos(0.5\theta) - \cos(r\theta)\sin(0.5\theta)$. Subtracting, $\sin(r+0.5)\theta - \sin(r-0.5)\theta = 2\cos(r\theta)\sin(0.5\theta)$.

(ii) $\cos(r\theta) = \frac{\sin(r+0.5)\theta - \sin(r-0.5)\theta}{2\sin(0.5\theta)}$. And so

$$\begin{aligned}
\sum_{r=1}^{n}\cos(r\theta) &= \sum_{r=1}^{n}\frac{\sin(r+0.5)\theta - \sin(r-0.5)\theta}{2\sin(0.5\theta)} = \frac{0.5}{\sin(0.5\theta)}\sum_{r=1}^{n}\left[\sin(r+0.5)\theta - \sin(r-0.5)\theta\right] \\
&= \frac{0.5}{\sin(0.5\theta)}\left[\sin(1.5\theta) - \sin(0.5\theta) + \dots + \sin(n+0.5)\theta - \sin(n-0.5)\theta\right] \\
&= \frac{0.5}{\sin(0.5\theta)}\left[-\sin(0.5\theta) + \sin(n+0.5)\theta\right] = -0.5 + \frac{\sin(n+0.5)\theta}{2\sin(0.5\theta)}.
\end{aligned}$$

(iii) Step #1. Let P(k) stand for the proposition that

$$\sum_{r=1}^{k}\sin(r\theta) = \frac{\cos(0.5\theta) - \cos(k+0.5)\theta}{2\sin(0.5\theta)}.$$

Our goal is to show that P(k) is true for all k = 1, 2, 3, . . .

Step #2. Verify that P(1) is true. Note that $\sin(0.5+0.5)\theta = \sin(0.5\theta)\cos(0.5\theta) + \cos(0.5\theta)\sin(0.5\theta) = 2\sin(0.5\theta)\cos(0.5\theta)$.

$$\begin{aligned}
\sum_{r=1}^{1}\sin(r\theta) = \sin\theta &= \frac{2\sin\theta\sin(0.5\theta)}{2\sin(0.5\theta)} = \frac{-2\sin\theta\sin(-0.5\theta)}{2\sin(0.5\theta)} \\
&= \frac{-2\sin[0.5(2\theta)]\sin[0.5(-\theta)]}{2\sin(0.5\theta)} = \frac{\cos(0.5\theta) - \cos(1+0.5)\theta}{2\sin(0.5\theta)}. \quad ✓
\end{aligned}$$

To get the last equality, I used the fact that $\cos P - \cos Q = -2\sin[0.5(P+Q)]\sin[0.5(P-Q)]$ (here with $P = 0.5\theta$ and $Q = 1.5\theta$), which is actually printed on your List of Formulae! (So here’s another exam tip: Whenever you see trigonometric functions and are stuck, go look up the List of Formulae.)

Step #3. Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ). Assume that P(j) is true. That is,

$$\sum_{r=1}^{j}\sin(r\theta) = \frac{\cos(0.5\theta) - \cos(j+0.5)\theta}{2\sin(0.5\theta)}.$$

Our goal is to show that P(j + 1) is true. That is,

$$\sum_{r=1}^{j+1}\sin(r\theta) = \frac{\cos(0.5\theta) - \cos(j+1+0.5)\theta}{2\sin(0.5\theta)}.$$

To this end, write:

$$\begin{aligned}
\sum_{r=1}^{j+1}\sin(r\theta) &= \sum_{r=1}^{j}\sin(r\theta) + \sin[(j+1)\theta] \\
&= \frac{\cos(0.5\theta) - \cos(j+0.5)\theta}{2\sin(0.5\theta)} + \sin[(j+1)\theta] \\
&= \frac{\cos(0.5\theta) - \cos(j+0.5)\theta + 2\sin[(j+1)\theta]\sin(0.5\theta)}{2\sin(0.5\theta)} \\
&= \frac{\cos(0.5\theta) - \cos(j+0.5)\theta + \cos(j+0.5)\theta - \cos(j+1.5)\theta}{2\sin(0.5\theta)} \\
&= \frac{\cos(0.5\theta) - \cos(j+1+0.5)\theta}{2\sin(0.5\theta)},
\end{aligned}$$

as desired. (Again, to rewrite $2\sin[(j+1)\theta]\sin(0.5\theta)$ as a difference of cosines, I used the same trigonometric identity as before.)
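Both trigonometric sums in this answer can be checked numerically for a few sample values. The sketch below compares each sum with its closed form, taking the cosine sum to be $-\frac{1}{2} + \frac{\sin(n+0.5)\theta}{2\sin(0.5\theta)}$; the function names and the sample values of $n$ and $\theta$ are arbitrary choices.

```python
import math

def sine_sum(n, theta):
    """sin(theta) + sin(2*theta) + ... + sin(n*theta), summed directly."""
    return sum(math.sin(r * theta) for r in range(1, n + 1))

def sine_closed(n, theta):
    """Closed form P(n); requires sin(theta / 2) != 0."""
    return (math.cos(0.5 * theta) - math.cos((n + 0.5) * theta)) / (2 * math.sin(0.5 * theta))

def cosine_sum(n, theta):
    """cos(theta) + cos(2*theta) + ... + cos(n*theta), summed directly."""
    return sum(math.cos(r * theta) for r in range(1, n + 1))

def cosine_closed(n, theta):
    """Closed form -1/2 + sin((n + 1/2)*theta) / (2*sin(theta/2))."""
    return -0.5 + math.sin((n + 0.5) * theta) / (2 * math.sin(0.5 * theta))

for n, theta in [(1, 0.3), (5, 1.0), (12, 2.5)]:
    print(n, theta,
          sine_sum(n, theta) - sine_closed(n, theta),
          cosine_sum(n, theta) - cosine_closed(n, theta))
```

The printed differences should be zero up to floating-point rounding.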


Answer to Exercise 317 (9740 N2011/I/9). (i) The depth drilled on the 10th day is $256 - 9 \times 7 = 193$ metres. The total depth drilled is

$$256 + (256 - 1 \times 7) + (256 - 2 \times 7) + \dots + \underbrace{(256 - 36 \times 7)}_{=4} = [256 + (256 - 36 \times 7)] \times \frac{37}{2} = 4810 \text{ metres}.$$

(ii) The theoretical maximum is $\frac{256}{1 - 8/9} = 2304$ metres. The depth drilled at the end of the $n$th day is $256\,\frac{1 - (8/9)^n}{1 - 8/9}$. We want the latter quantity to be more than 99% of the former. That is, we want $1 - (8/9)^n > 0.99$ or $0.01 > (8/9)^n$ or $\ln 0.01 > n\ln(8/9)$ or $n > \ln 0.01 \div \ln(8/9) \approx 39.099$ (the inequality reverses in the last step because $\ln(8/9) < 0$). So it takes 40 days.
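Both parts can be confirmed by brute force. This minimal sketch lists the daily depths for part (i) and, for part (ii), counts days until the cumulative depth first exceeds 99% of the theoretical maximum; the variable names are illustrative.

```python
# Part (i): depths fall by 7 m a day from 256 m, over 37 days.
depths = [256 - 7 * k for k in range(37)]
tenth_day = depths[9]
total_ap = sum(depths)

# Part (ii): each day's depth is 8/9 of the previous day's.
ratio = 8 / 9
maximum = 256 / (1 - ratio)
days = 1
while 256 * (1 - ratio ** days) / (1 - ratio) <= 0.99 * maximum:
    days += 1
print(tenth_day, total_ap, round(maximum), days)
```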

Answer to Exercise 318 (9740 N2010/I/3). (i) $u_n = S_n - S_{n-1} = n(2n+c) - (n-1)[2(n-1)+c] = 2n^2 + cn - 2n^2 + 4n - 2 - cn + c = 4n - 2 + c$.

(ii) We know that

$$u_n = 4n - 2 + c, \qquad u_{n+1} = 4(n+1) - 2 + c.$$

So $u_{n+1} = u_n + 4$.

Answer to Exercise 319 (9740 N2010/II/2). (i) Step #1. Let P(k) stand for the proposition that

$$\sum_{r=1}^{k} r(r+2) = \frac{1}{6}k(k+1)(2k+7).$$

Our goal is to show that P(k) is true for all k = 1, 2, 3, . . .

Step #2. Verify that P(1) is true.

$$\sum_{r=1}^{1} r(r+2) = 1(1+2) = 3 = \frac{1}{6}(1)(1+1)(2 \times 1 + 7). \quad ✓$$

Step #3. Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ).

Assume that P(j) is true. That is,

$$\sum_{r=1}^{j} r(r+2) = \frac{1}{6}j(j+1)(2j+7).$$

Our goal is to show that P(j + 1) is true. That is,

$$\sum_{r=1}^{j+1} r(r+2) = \frac{1}{6}(j+1)[(j+1)+1][2(j+1)+7].$$

To this end, write:

$$\begin{aligned}
\sum_{r=1}^{j+1} r(r+2) &= \sum_{r=1}^{j} r(r+2) + (j+1)[(j+1)+2] \\
&= \frac{1}{6}j(j+1)(2j+7) + (j+1)[(j+1)+2] \\
&= \frac{1}{6}(j+1)\left\{j(2j+7) + 6[(j+1)+2]\right\} \\
&= \frac{1}{6}(j+1)(2j^2+7j+6j+18) \\
&= \frac{1}{6}(j+1)(2j^2+13j+18) \\
&= \frac{1}{6}(j+1)(j+2)(2j+9) \\
&= \frac{1}{6}(j+1)[(j+1)+1][2(j+1)+7],
\end{aligned}$$

as desired.

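(ii) (a) Observe that $\frac{1}{r(r+2)} = \frac{0.5}{r} - \frac{0.5}{r+2}$. Hence,

$$\begin{aligned}
\sum_{r=1}^{n}\frac{1}{r(r+2)} &= \sum_{r=1}^{n}\left(\frac{0.5}{r} - \frac{0.5}{r+2}\right) = 0.5\sum_{r=1}^{n}\left(\frac{1}{r} - \frac{1}{r+2}\right) \\
&= 0.5\left[\left(\frac{1}{1} - \frac{1}{3}\right) + \left(\frac{1}{2} - \frac{1}{4}\right) + \left(\frac{1}{3} - \frac{1}{5}\right) + \dots + \left(\frac{1}{n-1} - \frac{1}{n+1}\right) + \left(\frac{1}{n} - \frac{1}{n+2}\right)\right] \\
&= 0.5\left[\frac{1}{1} + \frac{1}{2} - \frac{1}{n+1} - \frac{1}{n+2}\right] = \frac{3}{4} - \frac{1}{2(n+1)} - \frac{1}{2(n+2)},
\end{aligned}$$

as desired.

(b) As $n \to \infty$, $0.5(n+1)^{-1} \to 0$ and $0.5(n+2)^{-1} \to 0$. So as $n \to \infty$,

$$\sum_{r=1}^{n}\frac{1}{r(r+2)} = \frac{3}{4} - \frac{1}{2(n+1)} - \frac{1}{2(n+2)} \to \frac{3}{4}.$$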

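Answer to Exercise 320 (9740 N2009/I/3). (i)

$$\frac{1}{n-1} - \frac{2}{n} + \frac{1}{n+1} = \frac{n(n+1) - 2(n-1)(n+1) + (n-1)n}{(n-1)n(n+1)} = \frac{n^2 + n - 2(n^2-1) + n^2 - n}{n(n^2-1)} = \frac{-2(-1)}{n^3-n} = \frac{2}{n^3-n}.$$

(ii)

$$\begin{aligned}
\sum_{r=2}^{n}\frac{1}{r^3-r} &= 0.5\sum_{r=2}^{n}\left(\frac{1}{r-1} - \frac{2}{r} + \frac{1}{r+1}\right) \\
&= 0.5\left[\left(\frac{1}{1} - \frac{2}{2} + \frac{1}{3}\right) + \left(\frac{1}{2} - \frac{2}{3} + \frac{1}{4}\right) + \left(\frac{1}{3} - \frac{2}{4} + \frac{1}{5}\right) + \left(\frac{1}{4} - \frac{2}{5} + \frac{1}{6}\right) + \left(\frac{1}{5} - \frac{2}{6} + \frac{1}{7}\right) + \dots + \left(\frac{1}{n-1} - \frac{2}{n} + \frac{1}{n+1}\right)\right] \\
&= 0.5\left(\frac{1}{2} - \frac{1}{n} + \frac{1}{n+1}\right).
\end{aligned}$$

(iii) As $n \to \infty$, $\frac{1}{n} \to 0$ and $\frac{1}{n+1} \to 0$. So as $n \to \infty$,

$$\sum_{r=2}^{n}\frac{1}{r^3-r} = 0.5\left(\frac{1}{2} - \frac{1}{n} + \frac{1}{n+1}\right) \to 0.5\left(\frac{1}{2}\right) = \frac{1}{4}.$$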
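Telescoping sums are easy to get wrong at the end terms, so it is worth checking the closed form against a direct sum. The sketch below does so in exact arithmetic; `direct_sum` and `telescoped` are illustrative names.

```python
from fractions import Fraction

def direct_sum(n):
    """Sum of 1 / (r^3 - r) for r = 2, ..., n, computed exactly."""
    return sum(Fraction(1, r ** 3 - r) for r in range(2, n + 1))

def telescoped(n):
    """Closed form from part (ii): 0.5 * (1/2 - 1/n + 1/(n + 1))."""
    return Fraction(1, 2) * (Fraction(1, 2) - Fraction(1, n) + Fraction(1, n + 1))

for n in (2, 3, 10, 100):
    print(n, direct_sum(n), telescoped(n))
```

The partial sums increase towards the limit $\frac{1}{4}$ found in part (iii) without ever reaching it.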


Answer to Exercise 321 (9740 N2009/I/5). (i) Step #1. Let P(k) stand for the proposition that $\sum_{r=1}^{k} r^2 = \frac{1}{6}k(k+1)(2k+1)$.

Our goal is to show that P(k) is true for all k = 1, 2, 3, . . .

Step #2. Verify that P(1) is true: $\sum_{r=1}^{1} r^2 = 1 = \frac{1}{6}(1)(1+1)(2 \times 1 + 1)$. ✓

Step #3. Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ). Assume that P(j) is true. That is,

$$\sum_{r=1}^{j} r^2 = \frac{1}{6}j(j+1)(2j+1).$$

Our goal is to show that P(j + 1) is true. That is,

$$\sum_{r=1}^{j+1} r^2 = \frac{1}{6}(j+1)[(j+1)+1][2(j+1)+1].$$

To this end, write:

$$\begin{aligned}
\sum_{r=1}^{j+1} r^2 &= \sum_{r=1}^{j} r^2 + (j+1)^2 \\
&= \frac{1}{6}j(j+1)(2j+1) + (j+1)^2 \\
&= \frac{1}{6}(j+1)\left[j(2j+1) + 6(j+1)\right] \\
&= \frac{1}{6}(j+1)(2j^2+j+6j+6) \\
&= \frac{1}{6}(j+1)(2j^2+7j+6) \\
&= \frac{1}{6}(j+1)(j+2)(2j+3) \\
&= \frac{1}{6}(j+1)[(j+1)+1][2(j+1)+1],
\end{aligned}$$

as desired.

(ii)

$$\begin{aligned}
\sum_{r=n+1}^{2n} r^2 &= \sum_{r=1}^{2n} r^2 - \sum_{r=1}^{n} r^2 = \frac{1}{6}\,2n(2n+1)[2(2n)+1] - \frac{1}{6}n(n+1)(2n+1) \\
&= \frac{n(2n+1)}{6}\left[2(4n+1) - (n+1)\right] = \frac{n(2n+1)}{6}(7n+1).
\end{aligned}$$

Answer to Exercise 322 (9740 N2009/I/8). (i) $20 \times r^{24} = 5 \iff r = (0.25)^{1/24}$. In the limit, the total length of all the bars is $\frac{20}{1-r} \approx 356.343$ cm. And so indeed, no matter how many bars there are, the total length must be less than 357 cm.

(ii) The total length is $L = 20\,\frac{1 - r^{25}}{1 - r} \approx 272.257$ cm.

The length of the 13th bar is $20 \times r^{12} = 20 \times (0.25)^{12/24} = 20 \times \sqrt{0.25} = 20 \times 0.5 = 10$ cm.

(iii) The total length is $L = 5 + (5+d) + (5+2d) + \dots + (5+24d) = (10 + 24d) \times \frac{25}{2}$. So $d = \frac{2L/25 - 10}{24} \approx 0.491$ cm.

The longest bar has length $5 + 24d \approx 16.781$ cm.
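The numerical values quoted in this answer can be reproduced in a few lines. This is a minimal sketch that follows the formulas above step by step; the variable names are illustrative.

```python
r = 0.25 ** (1 / 24)                  # common ratio, from 20 * r**24 = 5
limit = 20 / (1 - r)                  # sum to infinity of the bar lengths
L = 20 * (1 - r ** 25) / (1 - r)      # total length of the 25 bars
bar_13 = 20 * r ** 12                 # length of the 13th bar
d = (2 * L / 25 - 10) / 24            # common difference in part (iii)
longest = 5 + 24 * d
print(round(limit, 3), round(L, 3), round(bar_13, 3), round(d, 3), round(longest, 3))
```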


Answer to Exercise 323 (9740 N2008/I/2).

The $n$th term of a sequence is given by $u_n = n(2n+1)$, for $n \geq 1$. The sum of the first $n$ terms is denoted by $S_n$. Use the method of mathematical induction to show that $S_n = \frac{1}{6}n(n+1)(4n+5)$ for all positive integers $n$. [5]

Step #1. Let P(k) stand for the proposition that $S_k = \frac{1}{6}k(k+1)(4k+5)$.

Our goal is to show that P(k) is true for all k = 1, 2, 3, . . .

Step #2. Verify that P(1) is true.

$$S_1 = u_1 = 1(2 \times 1 + 1) = 3 = \frac{1}{6}\,1(1+1)(4 \times 1 + 5). \quad ✓$$

Step #3. Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ). Assume that P(j) is true. That is,

$$S_j = \frac{1}{6}j(j+1)(4j+5).$$

Our goal is to show that P(j + 1) is true. That is,

$$S_{j+1} = \frac{1}{6}(j+1)[(j+1)+1][4(j+1)+5].$$

To this end, write:

$$\begin{aligned}
S_{j+1} &= S_j + u_{j+1} \\
&= \frac{1}{6}j(j+1)(4j+5) + (j+1)[2(j+1)+1] \\
&= \frac{1}{6}(j+1)\left[j(4j+5) + 6(2j+3)\right] \\
&= \frac{1}{6}(j+1)(4j^2+5j+12j+18) \\
&= \frac{1}{6}(j+1)(4j^2+17j+18) \\
&= \frac{1}{6}(j+1)(j+2)(4j+9) \\
&= \frac{1}{6}(j+1)[(j+1)+1][4(j+1)+5],
\end{aligned}$$

as desired.

Answer to Exercise 324 (9740 N2008/I/10). (i) In the 1st month (Jan 2009), she has saved 10 dollars. In the 2nd (Feb 2009), she has saved $10 + (10 + 1 \times 3)$ dollars. So in the $n$th month, she has saved

$$10 + (10 + 1 \times 3) + (10 + 2 \times 3) + \dots + [10 + (n-1) \times 3] = [20 + (n-1) \times 3] \times \frac{n}{2} = 8.5n + 1.5n^2 \text{ dollars}.$$

Set $8.5n + 1.5n^2 = 2000$ and solve:

$$n = \frac{-8.5 \pm \sqrt{8.5^2 - 4(1.5)(-2000)}}{3} = \frac{-8.5 \pm 109.874}{3}.$$

We can ignore the negative root. So $n = \frac{-8.5 + 109.874}{3} \approx 33.791$. So it is only in the 34th month that she has saved over \$2000. That is 1 October 2011.

(ii) (a) At the end of 2 years, her original \$10 has earned $10 \times 1.02^{24} - 10 \approx 6.084$ dollars in compound interest.

(b) At the end of 2 years, the total in her account is

$$10 \times 1.02^{24} + 10 \times 1.02^{23} + 10 \times 1.02^{22} + \dots + 10 \times 1.02^{1} = 10 \times 1.02 \times \frac{1 - 1.02^{24}}{1 - 1.02},$$

or approximately 310.303 dollars.

(c) After $n$ complete months, her savings total $10 \times 1.02 \times \frac{1 - 1.02^n}{1 - 1.02}$ dollars. Set this equal to 2000 and solve.

$$\begin{aligned}
10 \times 1.02 \times \frac{1 - 1.02^n}{1 - 1.02} = 2000 &\iff 1 - 1.02^n = 2000 \times \frac{-0.02}{10.2} \\
&\iff 1 + 2000 \times \frac{0.02}{10.2} = 1.02^n \\
&\iff n = \ln\left(1 + 2000 \times \frac{0.02}{10.2}\right) \div \ln(1.02) \approx 80.476.
\end{aligned}$$

So it is only at the end of 81 complete months that her total savings first exceed \$2000.
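The figures in both parts can be confirmed by summing the deposits month by month. The sketch below assumes the model used in the working above: deposits of 10, 13, 16, ... dollars without interest in part (i), and in part (ii) a deposit of \$10 at the start of each month with 2% interest added at each month end. The names are illustrative.

```python
import math

# Part (i): deposits of 10, 13, 16, ... dollars, no interest.
n = 1
while sum(10 + 3 * k for k in range(n)) <= 2000:
    n += 1

# Part (ii): $10 deposited each month, 2% compound interest per month.
def savings(months):
    """Total after the given number of complete months."""
    return sum(10 * 1.02 ** k for k in range(1, months + 1))

interest_on_first = 10 * 1.02 ** 24 - 10
after_two_years = savings(24)
n_exact = math.log(1 + 2000 * 0.02 / 10.2) / math.log(1.02)
print(n, round(interest_on_first, 3), round(after_two_years, 3), math.ceil(n_exact))
```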


Answer to Exercise 325 (9233 N2008/I/14). (i) Step #1. Let P(k) stand for the proposition that

$$1 + 2x + 3x^2 + \dots + kx^{k-1} = \frac{1 - (k+1)x^{k} + kx^{k+1}}{(1-x)^2}.$$

Our goal is to show that P(k) is true for all k = 1, 2, 3, . . .

Step #2. Verify that P(1) is true.

$$\frac{1 - (1+1)x^{1} + 1 \times x^{1+1}}{(1-x)^2} = \frac{1 - 2x + x^2}{(1-x)^2} = 1. \quad ✓$$

Step #3. Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ). Assume that P(j) is true. That is,

$$1 + 2x + 3x^2 + \dots + jx^{j-1} = \frac{1 - (j+1)x^{j} + jx^{j+1}}{(1-x)^2}.$$

Our goal is to show that P(j + 1) is true. That is,

$$1 + 2x + 3x^2 + \dots + (j+1)x^{j} = \frac{1 - (j+2)x^{j+1} + (j+1)x^{j+2}}{(1-x)^2}.$$

To this end, write:

$$\begin{aligned}
1 + 2x + 3x^2 + \dots + (j+1)x^{j}
&= \frac{1 - (j+1)x^{j} + jx^{j+1}}{(1-x)^2} + (j+1)x^{j} \\
&= \frac{1 - (j+1)x^{j} + jx^{j+1} + (1-x)^2 (j+1)x^{j}}{(1-x)^2} \\
&= \frac{1 - (j+1)x^{j} + jx^{j+1} + (1 - 2x + x^2)(j+1)x^{j}}{(1-x)^2} \\
&= \frac{1 - (j+1)x^{j} + jx^{j+1} + (j+1)x^{j} - 2(j+1)x^{j+1} + (j+1)x^{j+2}}{(1-x)^2} \\
&= \frac{1 + jx^{j+1} - 2(j+1)x^{j+1} + (j+1)x^{j+2}}{(1-x)^2} \\
&= \frac{1 - (j+2)x^{j+1} + (j+1)x^{j+2}}{(1-x)^2}, \text{ as desired.}
\end{aligned}$$
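As a sanity check on the closed form proved above (not a substitute for the induction), both sides can be compared exactly for a few rational values of x. This is a sketch; the helper names `lhs` and `rhs` are made up for illustration.

```python
from fractions import Fraction


def lhs(x, k):
    """Left side: 1 + 2x + 3x^2 + ... + k x^(k-1), summed term by term."""
    return sum(r * x ** (r - 1) for r in range(1, k + 1))


def rhs(x, k):
    """Right side: the closed form [1 - (k+1)x^k + k x^(k+1)] / (1 - x)^2, valid for x != 1."""
    return (1 - (k + 1) * x ** k + k * x ** (k + 1)) / (1 - x) ** 2


if __name__ == "__main__":
    for x in (Fraction(1, 2), Fraction(-3), Fraction(5, 7)):
        for k in range(1, 11):
            # Exact rational arithmetic, so equality is tested without rounding error.
            assert lhs(x, k) == rhs(x, k)
    print("closed form agrees with the direct sum")
```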


(ii) By the First Fundamental Theorem of Calculus,

$$\frac{d}{dx}\left[\int \left(1 + 2x + 3x^2 + \dots + nx^{n-1}\right) dx\right] = 1 + 2x + 3x^2 + \dots + nx^{n-1}.$$

But

$$\begin{aligned}
\frac{d}{dx}\left[\int \left(1 + 2x + 3x^2 + \dots + nx^{n-1}\right) dx\right]
&= \frac{d}{dx}\left[x + x^2 + x^3 + \dots + x^n + c\right] && (c \in \mathbb{R}) \\
&= \frac{d}{dx}\left[\frac{x(1 - x^n)}{1 - x} + c\right] && \text{(geometric series)} \\
&= \frac{(1-x)\left[(1 - x^n) + x\left(-nx^{n-1}\right)\right] - x(1 - x^n)(-1)}{(1-x)^2} && \text{(quotient rule)} \\
&= \frac{(1-x)\left[1 - x^n - nx^n\right] + x(1 - x^n)}{(1-x)^2} \\
&= \frac{(1-x)\left[1 - (n+1)x^n\right] + x(1 - x^n)}{(1-x)^2} \\
&= \frac{1 - (n+1)x^n - x + (n+1)x^{n+1} + x - x^{n+1}}{(1-x)^2} \\
&= \frac{1 - (n+1)x^n + nx^{n+1}}{(1-x)^2},
\end{aligned}$$

as desired.

Answer to Exercise 326 (9233 N2008/II/2). Let the AP have common difference d and the GP have common ratio r. Then from the information given,

$$(0.5 + d) + 0.5r = 0.5 \iff d + 0.5r = 0, \qquad (1)$$

$$(0.5 + 2d) + 0.5r^2 = \frac{1}{8} \iff 2d + 0.5r^2 = -\frac{3}{8}. \qquad (2)$$

Take (2) minus 2 × (1) to get $0.5r^2 - r = -3/8$ or $4r^2 - 8r + 3 = 0$. And so

$$r = \frac{8 \pm \sqrt{8^2 - 4(4)(3)}}{2(4)} = 1 \pm \sqrt{1 - 3/4} = 0.5, \ 1.5.$$

Since the geometric progression is convergent, it must be that $|r| < 1$ and so r = 0.5. Hence, its sum to infinity is simply $\frac{0.5}{1 - 0.5} = 1$.


Answer to Exercise 327 (9740 N2007/I/9). (i) α ≈ 0.619 and β ≈ 1.512 (calculator).

(ii) If the sequence converges to some limit L, then

$$\frac{1}{3}e^{x_n} - x_{n+1} \to 0 \implies \frac{1}{3}e^{L} - L = 0 \iff e^{L} - 3L = 0.$$

So L is either α or β.

(iii) If x1 = 0, then x2 = 1/3, x3 ≈ 0.465, x4 ≈ 0.531, x5 ≈ 0.567, x6 ≈ 0.588, . . . , x15 ≈ 0.619. So the sequence converges to α ≈ 0.619.

If x1 = 1, then x2 = 0.906, x3 ≈ 0.825, x4 ≈ 0.761, x5 ≈ 0.713, x6 ≈ 0.680, . . . , x17 ≈ 0.619. So the sequence converges to α ≈ 0.619. If x1 = 2, then x2 = 2.463, x3 ≈ 3.913, x4 ≈ 16.690, x5 ≈ 5903230.335, ... So the sequence diverges.
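The three starting values above can be reproduced numerically. This is a minimal sketch assuming the recurrence is $x_{n+1} = \frac{1}{3}e^{x_n}$ (which matches the listed terms, e.g. $x_2 = 1/3$ when $x_1 = 0$); the function name `iterate` is hypothetical.

```python
import math


def iterate(x1, steps):
    """Return [x_1, ..., x_steps] for the recurrence x_{n+1} = exp(x_n) / 3."""
    xs = [x1]
    for _ in range(steps - 1):
        xs.append(math.exp(xs[-1]) / 3)
    return xs


if __name__ == "__main__":
    print(iterate(0, 15))  # creeps upward towards alpha
    print(iterate(1, 17))  # decreases towards alpha
    # Only five terms from x1 = 2: the sixth would overflow math.exp.
    print(iterate(2, 5))
```

Starting from 2, the terms grow so quickly that floating-point overflow occurs almost immediately, which is consistent with the divergence noted above.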

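(iv) We are given the graph of $y = e^x - 3x$, which shows that if $\alpha < x < \beta$, then $e^x - 3x < 0$ or equivalently $\frac{1}{3}e^x - x < 0$. And if $x < \alpha$ or $x > \beta$, then $e^x - 3x > 0$ or equivalently $\frac{1}{3}e^x - x > 0$.

Hence,

$$x_{n+1} - x_n = \frac{1}{3}e^{x_n} - x_n \begin{cases} < 0, & \text{if } \alpha < x_n < \beta, \\ > 0, & \text{if } x_n < \alpha \text{ or } x_n > \beta. \end{cases}$$

Equivalently,

$$\begin{cases} x_{n+1} < x_n & \text{if } \alpha < x_n < \beta, \\ x_{n+1} > x_n & \text{if } x_n < \alpha \text{ or } x_n > \beta, \end{cases}$$

as desired.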

(v) If xn is ever greater than β, then the sequence will diverge.

If xn is smaller than α, then we cannot make any firm conclusion based on the results in part (iv). The sequence may or may not converge. This is because if xn < α, then there is always the possibility that xn+1 > β, whereupon the sequence will diverge.


Answer to Exercise 328 (9740 N2007/I/10). (i) From the information given, the first term of the geometric series is a and moreover

$$ar = a + 3d \iff d = \frac{ar - a}{3}, \qquad (1)$$

$$ar^2 = a + 5d \implies ar^2 = a + 5\,\frac{ar - a}{3} \iff 3r^2 = 3 + 5(r - 1) \iff 3r^2 - 5r + 2 = 0.$$

(ii) $3r^2 - 5r + 2 = (3r - 2)(r - 1) = 0$ and so r = 2/3 or r = 1. But if r = 1, then from (1), we have d = 0, contradicting our assumption that d ≠ 0. Hence, r = 2/3. Since $|r| < 1$, the geometric series is convergent and has sum to infinity $\frac{a}{1 - r} = 3a$.

(iii) From $d = (ar - a)/3$ and the fact that r = 2/3, d = −a/9. So

$$\begin{aligned}
S &= a + (a + d) + (a + 2d) + \dots + [a + (n-1)d] \\
&= [2a + (n-1)d] \times \frac{n}{2} \\
&= \left[2a - \frac{n-1}{9}a\right] \times \frac{n}{2} \\
&= \left(\frac{19}{9}a - \frac{n}{9}a\right) \times \frac{n}{2}.
\end{aligned}$$

Set S > 4a and solve (dividing both sides by a, which requires a > 0):

$$\begin{aligned}
\left(\frac{19}{9}a - \frac{n}{9}a\right) \times \frac{n}{2} > 4a
&\iff \left(\frac{19}{9} - \frac{n}{9}\right) n > 8 \\
&\iff (19 - n)\,n > 72 \\
&\iff 0 > n^2 - 19n + 72.
\end{aligned}$$

$n^2 - 19n + 72$ is a ∪-shaped quadratic with zeros

$$n = \frac{19 \pm \sqrt{19^2 - 4(1)(72)}}{2} = \frac{19 \pm \sqrt{73}}{2} \approx 5.228, \ 13.772.$$

So $n^2 - 19n + 72 < 0 \iff n \in \{6, 7, 8, 9, 10, 11, 12, 13\}$.
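The set of valid n can also be found by direct enumeration. The sketch below assumes a > 0 and takes a = 1 for illustration (the inequality S > 4a does not depend on the size of a once a is positive); `arithmetic_sum` and `valid_n` are hypothetical names.

```python
from fractions import Fraction


def arithmetic_sum(n, a=Fraction(1)):
    """Sum of the first n terms of the arithmetic series with first term a and d = -a/9."""
    d = -a / 9
    return Fraction(n, 2) * (2 * a + (n - 1) * d)


def valid_n(a=Fraction(1), n_max=40):
    """All n up to n_max for which the partial sum exceeds 4a."""
    return [n for n in range(1, n_max + 1) if arithmetic_sum(n, a) > 4 * a]


if __name__ == "__main__":
    print(valid_n())
```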

Answer to Exercise 329 (9740 N2007/II/2). (i) Step #1. Let P(k) stand for the proposition that $u_k = 1/k^2$. Our goal is to show that P(k) is true for all k = 1, 2, 3, . . .

Step #2. Verify that P(1) is true: $u_1 = 1 = 1/1^2$. ✓

Step #3. Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ). Assume that P(j) is true. That is, $u_j = 1/j^2$.

Our goal is to show that P(j + 1) is true. That is, $u_{j+1} = \frac{1}{(j+1)^2}$.

To this end, write:

$$\begin{aligned}
u_{j+1} &= u_j - \frac{2j+1}{j^2 (j+1)^2} = \frac{1}{j^2} - \frac{2j+1}{j^2 (j+1)^2} \\
&= \frac{1}{j^2}\left[1 - \frac{2j+1}{(j+1)^2}\right] = \frac{1}{j^2}\left[\frac{(j+1)^2 - (2j+1)}{(j+1)^2}\right] \\
&= \frac{1}{j^2}\left[\frac{j^2 + 2j + 1 - (2j+1)}{(j+1)^2}\right] = \frac{1}{j^2}\left[\frac{j^2}{(j+1)^2}\right] \\
&= \frac{1}{(j+1)^2}, \text{ as desired.}
\end{aligned}$$

(ii)

$$\begin{aligned}
\sum_{n=1}^{N} \frac{2n+1}{n^2 (n+1)^2} &= \sum_{n=1}^{N} (u_n - u_{n+1}) \\
&= (u_1 - u_2) + (u_2 - u_3) + (u_3 - u_4) + \dots + (u_N - u_{N+1}) \\
&= u_1 - u_{N+1} = 1 - \frac{1}{(N+1)^2}.
\end{aligned}$$

(iii) As $N \to \infty$, $(N+1)^{-2} \to 0$ and so $\sum_{n=1}^{N} \frac{2n+1}{n^2 (n+1)^2} = 1 - \frac{1}{(N+1)^2} \to 1$.

(iv) Observe that

$$\frac{2n+1}{n^2 (n+1)^2} = \frac{2(n+1) - 1}{(n+1)^2 (n+1-1)^2}.$$

Hence,

$$\sum_{n=2}^{N} \frac{2n-1}{n^2 (n-1)^2} = \sum_{n=1}^{N-1} \frac{2(n+1)-1}{(n+1)^2 (n+1-1)^2} = \sum_{n=1}^{N-1} \frac{2n+1}{n^2 (n+1)^2} = 1 - \frac{1}{N^2}.$$


Answer to Exercise 330 (9233 N2007/I/14). Step #1. Let P(k) stand for the proposition that

$$\sum_{r=1}^{k} \sin rx = \frac{\cos(0.5x) - \cos[(k+0.5)x]}{2\sin(0.5x)}.$$

Our goal is to show that P(k) is true for all k = 1, 2, 3, . . .

Step #2. Verify that P(1) is true.

$$\sum_{r=1}^{1} \sin rx = \sin x = \frac{2 \sin x \sin(0.5x)}{2\sin(0.5x)} = \frac{-2 \sin x \sin(-0.5x)}{2\sin(0.5x)} = \frac{\cos(0.5x) - \cos[(1+0.5)x]}{2\sin(0.5x)}. \quad ✓$$

The last step uses the identity $\cos P - \cos Q = -2 \sin[0.5(P+Q)] \sin[0.5(P-Q)]$.

Step #3. Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ). Assume that P(j) is true. That is,

$$\sum_{r=1}^{j} \sin rx = \frac{\cos(0.5x) - \cos[(j+0.5)x]}{2\sin(0.5x)}.$$

Our goal is to show that P(j + 1) is true. That is,

$$\sum_{r=1}^{j+1} \sin rx = \frac{\cos(0.5x) - \cos[(j+1.5)x]}{2\sin(0.5x)}.$$

To this end, write:

$$\begin{aligned}
\sum_{r=1}^{j+1} \sin rx &= \sum_{r=1}^{j} \sin rx + \sin[(j+1)x] \\
&= \frac{\cos(0.5x) - \cos[(j+0.5)x]}{2\sin(0.5x)} + \sin[(j+1)x] \\
&= \frac{\cos(0.5x) - \cos[(j+0.5)x] + 2\sin(0.5x)\sin[(j+1)x]}{2\sin(0.5x)} \\
&= \frac{\cos(0.5x) - \cos[(j+0.5)x] + \cos[(j+0.5)x] - \cos[(j+1.5)x]}{2\sin(0.5x)} \\
&= \frac{\cos(0.5x) - \cos[(j+1.5)x]}{2\sin(0.5x)}, \text{ as desired.}
\end{aligned}$$


Answer to Exercise 331 (9233 N2007/II/1).

$$\begin{aligned}
\sum_{r=1}^{2n} 3^{r+2} &= 3^2 \sum_{r=1}^{2n} 3^{r} = 9\left(3 + 3^2 + 3^3 + \dots + 3^{2n}\right) \\
&= 9 \times \frac{3\left(1 - 3^{2n}\right)}{1 - 3} = \frac{27\left(1 - 3^{2n}\right)}{-2} = \frac{27}{2}\left(3^{2n} - 1\right).
\end{aligned}$$

Answer to Exercise 332 (9233 N2006/I/1).

$$S_1 = 6 - \frac{2}{3^{1-1}} = 4 = a,$$

$$S_2 = 6 - \frac{2}{3^{2-1}} = \frac{16}{3} = a + ar = 4(1 + r) \iff r = \frac{1}{3}.$$

Answer to Exercise 333 (9233 N2006/I/11). (i) Step #1. Let P(k) stand for the proposition that $\sum_{r=1}^{k} r^3 = k^2 (k+1)^2 / 4$.

Our goal is to show that P(k) is true for all k = 1, 2, 3, . . .

Step #2. Verify that P(1) is true: $\sum_{r=1}^{1} r^3 = 1 = 1^2 (1+1)^2 / 4$. ✓

Step #3. Show that P(j) implies P(j + 1) (for all j = 1, 2, 3, . . . ).

Assume that P(j) is true. That is, $\sum_{r=1}^{j} r^3 = j^2 (j+1)^2 / 4$.

Our goal is to show that P(j + 1) is true. That is, $\sum_{r=1}^{j+1} r^3 = (j+1)^2 [(j+1)+1]^2 / 4$.

To this end, write:

$$\begin{aligned}
\sum_{r=1}^{j+1} r^3 &= \sum_{r=1}^{j} r^3 + (j+1)^3 = \frac{1}{4} j^2 (j+1)^2 + (j+1)^3 \\
&= \frac{1}{4}(j+1)^2 \left[j^2 + 4(j+1)\right] = \frac{1}{4}(j+1)^2 (j+2)^2 \\
&= \frac{1}{4}(j+1)^2 \left[(j+1)+1\right]^2, \text{ as desired.}
\end{aligned}$$

(ii)

$$2^3 + 4^3 + 6^3 + \dots + (2n)^3 = 2^3 \left(1^3 + 2^3 + 3^3 + \dots + n^3\right) = 8\left[\frac{1}{4} n^2 (n+1)^2\right] = 2n^2 (n+1)^2, \text{ as desired.}$$

(iii)

$$\begin{aligned}
\sum_{r=1}^{n} (2r-1)^3 &= 1^3 + 3^3 + 5^3 + 7^3 + \dots + (2n-1)^3 \\
&= \left[1^3 + 2^3 + 3^3 + 4^3 + \dots + (2n)^3\right] - \left[2^3 + 4^3 + 6^3 + 8^3 + \dots + (2n)^3\right] \\
&= \frac{1}{4}(2n)^2 (2n+1)^2 - 2n^2 (n+1)^2 \\
&= n^2 (2n+1)^2 - 2n^2 (n+1)^2 = n^2 \left[(2n+1)^2 - 2(n+1)^2\right] \\
&= n^2 \left[4n^2 + 4n + 1 - 2\left(n^2 + 2n + 1\right)\right] \\
&= n^2 \left(2n^2 - 1\right).
\end{aligned}$$
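The three cube-sum formulas above are easy to test by brute force for small n. This is only an illustrative sketch with made-up helper names, and it checks the formulas rather than proving them.

```python
def sum_cubes(n):
    """1^3 + 2^3 + ... + n^3, summed directly."""
    return sum(r ** 3 for r in range(1, n + 1))


def sum_even_cubes(n):
    """2^3 + 4^3 + ... + (2n)^3, summed directly."""
    return sum((2 * r) ** 3 for r in range(1, n + 1))


def sum_odd_cubes(n):
    """1^3 + 3^3 + ... + (2n - 1)^3, summed directly."""
    return sum((2 * r - 1) ** 3 for r in range(1, n + 1))


if __name__ == "__main__":
    for n in range(1, 51):
        assert 4 * sum_cubes(n) == n ** 2 * (n + 1) ** 2           # part (i)
        assert sum_even_cubes(n) == 2 * n ** 2 * (n + 1) ** 2       # part (ii)
        assert sum_odd_cubes(n) == n ** 2 * (2 * n ** 2 - 1)        # part (iii)
    print("all three formulas hold for n = 1, ..., 50")
```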

92.3 Answers for Ch. 76: Vectors

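Answer to Exercise 334 (9740 N2015/I/7).

(i) $\overrightarrow{OC} = 0.6\mathbf{a}$ and $\overrightarrow{OD} = \frac{5}{11}\mathbf{b}$.

(ii) $\overrightarrow{BC} = \overrightarrow{OC} - \overrightarrow{OB} = 0.6\mathbf{a} - \mathbf{b}$ and so the line BC can be written as $\mathbf{r} = \mathbf{b} + \lambda(0.6\mathbf{a} - \mathbf{b}) = 0.6\lambda\mathbf{a} + (1 - \lambda)\mathbf{b}$, for $\lambda \in \mathbb{R}$, as desired.

$\overrightarrow{AD} = \overrightarrow{OD} - \overrightarrow{OA} = \frac{5}{11}\mathbf{b} - \mathbf{a}$ and so the line AD can be written as $\mathbf{r} = \mathbf{a} + \mu\left(\frac{5}{11}\mathbf{b} - \mathbf{a}\right) = (1 - \mu)\mathbf{a} + \frac{5}{11}\mu\mathbf{b}$, for $\mu \in \mathbb{R}$.

(iii) Where the lines meet, we have $0.6\lambda\mathbf{a} + (1 - \lambda)\mathbf{b} = (1 - \mu)\mathbf{a} + \frac{5}{11}\mu\mathbf{b}$. Equating the coefficients, we have $0.6\lambda = 1 - \mu$ (1) and $\frac{5}{11}\mu = 1 - \lambda$ (2). From (1), we have $\mu = 1 - 0.6\lambda$. Plugging this into (2), we have $\frac{5}{11}(1 - 0.6\lambda) = 1 - \lambda \iff 1 - 0.6\lambda = \frac{11}{5} - \frac{11}{5}\lambda \iff \frac{8}{5}\lambda = \frac{6}{5} \iff \lambda = \frac{3}{4}$. And $\mu = 0.55$. Altogether then, the position vector of E is $0.45\mathbf{a} + 0.25\mathbf{b}$.

$\overrightarrow{AE} = -0.55\mathbf{a} + 0.25\mathbf{b}$ and $\overrightarrow{ED} = -0.45\mathbf{a} + \frac{9}{44}\mathbf{b}$. We observe that $\overrightarrow{AE} = \frac{11}{9}\overrightarrow{ED}$ and so the ratio AE ∶ ED is 11 ∶ 9.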


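Answer to Exercise 335 (9740 N2015/II/2). (i) The angle is

$$\cos^{-1}\left(\frac{(2, 3, -6) \cdot (1, 0, 0)}{|(2, 3, -6)|\,|(1, 0, 0)|}\right) = \cos^{-1}\left(\frac{2}{\sqrt{49}\sqrt{1}}\right) \approx 1.281.$$

(ii) The vector from a generic point on L to P is $(2, 5, -6) - \mathbf{r} = (2, 5, -6) - (1, -2, -4) - (2\lambda, 3\lambda, -6\lambda) = (1 - 2\lambda, 7 - 3\lambda, -2 + 6\lambda)$. The length of this vector is

$$\sqrt{(1 - 2\lambda)^2 + (7 - 3\lambda)^2 + (-2 + 6\lambda)^2} = \sqrt{49\lambda^2 - 70\lambda + 54}.$$

Setting the squared length equal to 33: $49\lambda^2 - 70\lambda + 54 = 33 \iff 49\lambda^2 - 70\lambda + 21 = 0 \iff 7\lambda^2 - 10\lambda + 3 = 0 \iff (7\lambda - 3)(\lambda - 1) = 0 \iff \lambda = 3/7, \ 1$.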

Hence, the two points are (1, −2, −4)+3/7(2, 3, −6) = 1/7(13, −5, −46) and (1, −2, −4)+(2, 3, −6) = (3, 1, −10).

49λ2 −70λ+54 is a ∪-shaped quadratic with minimum point given by 98λ−70 = 0 or λ = 5/7. Hence, the closest point is (1, −2, −4) + 5/7(2, 3, −6) = 1/7(17, 1, −58).

(iii) The plane is parallel to the vectors (2, 3, −6) and (2, 5, −6) − (1, −2, −4) = (1, 7, −2). It thus has normal vector (2, 3, −6) × (1, 7, −2) = (36, −2, 11). Moreover, we know that (1, −2, −4) is on the plane. Hence, a cartesian equation is 36x − 2y + 11z = 36 × 1 − 2 × (−2) + 11 × (−4) = −4.
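The vector arithmetic in parts (ii) and (iii) can be verified with a few lines of exact arithmetic. The following is a sketch only; the helper functions `sub`, `dot`, `cross` and `squared_distance` are written here for illustration and are not part of the original answer.

```python
from fractions import Fraction

POINT_ON_LINE = (1, -2, -4)   # known point on the line L
DIRECTION = (2, 3, -6)        # direction vector of L
P = (2, 5, -6)                # the given point P


def sub(u, v):
    """Componentwise difference u - v."""
    return tuple(a - b for a, b in zip(u, v))


def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    """Vector product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def squared_distance(lam):
    """Squared distance from P to the point on L with parameter lam."""
    point = tuple(a + lam * d for a, d in zip(POINT_ON_LINE, DIRECTION))
    diff = sub(P, point)
    return dot(diff, diff)


# Normal to the plane containing L and P, as in part (iii).
NORMAL = cross(DIRECTION, sub(P, POINT_ON_LINE))

if __name__ == "__main__":
    print(squared_distance(Fraction(3, 7)), squared_distance(1))  # the two points in part (ii)
    print(squared_distance(Fraction(5, 7)))                       # the closest point
    print(NORMAL, dot(NORMAL, POINT_ON_LINE))                     # normal vector and constant d
```

Using `Fraction` for the parameter keeps the squared distances exact, so they can be compared with 33 without rounding.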


Answer to Exercise 336 (9740 N2014/I/3). (i) One possibility is that one or both are zero vectors. And if neither is a zero vector, then they point either in the same direction or in the exact opposite directions; equivalently, either $\hat{\mathbf{a}} = \hat{\mathbf{b}}$ or $\hat{\mathbf{a}} = -\hat{\mathbf{b}}$.

(ii) $\widehat{(1, 2, -2)} = \frac{(1, 2, -2)}{3}$.

(iii) It is $\frac{|(1, 2, -2) \cdot (0, 0, 1)|}{3 \times 1} = \frac{2}{3}$.

Answer to Exercise 337 (9740 N2014/I/9). (i) The plane q is parallel to the vectors (1, 2, −3) and (2, −1, 4). It thus has normal vector (1, 2, −3) × (2, −1, 4) = (5, −10, −5) and hence also normal vector (−1, 2, 1). It contains the point (1, −1, 3). Altogether then, it has cartesian equation −x + 2y + z = 0. (ii) Line m has direction vector (−1, 2, 1) × (1, 2, −3) = (−8, −2, −4) and hence also direction vector (4, 1, 2).

To find a point that is on both planes, try plugging in x = 0. Then from the equation of q, we have z = −2y. Now plug this also into the equation of p to get 2y − 3(−2y) = 12 or y = 1.5. Hence, an intersection point is (0, 1.5, −3).

Altogether then, the line m has vector equation r = (0, 1.5, −3) + λ(4, 1, 2), for λ ∈ R.

(iii) $\overrightarrow{AB} = \overrightarrow{OB} - \overrightarrow{OA} = (4\lambda, 1.5 + \lambda, -3 + 2\lambda) - (1, -1, 3) = (4\lambda - 1, 2.5 + \lambda, -6 + 2\lambda)$. So $|\overrightarrow{AB}|^2 = (4\lambda - 1)^2 + (2.5 + \lambda)^2 + (-6 + 2\lambda)^2 = 21\lambda^2 - 27\lambda + 43.25$. This lattermost expression is a ∪-shaped quadratic, with minimum point given by $42\lambda - 27 = 0$ or $\lambda = 9/14$. So

$$B = (4\lambda, 1.5 + \lambda, -3 + 2\lambda) = \left(\frac{18}{7}, \frac{15}{7}, -\frac{12}{7}\right).$$

Answer to Exercise 338 (9740 N2013/I/1). (i) From the equation for p, we have $z = 0.5x - 2$ (1). Plug (1) into the equation for q to get $2x - 2y + 0.5x - 2 = 6 \iff y = 1.25x - 4$ (2).

Now plug (1) and (2) into the equation for r to get $5x - 4(1.25x - 4) + \mu(0.5x - 2) = -9 \iff 0.5\mu x + 25 - 2\mu = 0$ (3) $\iff x = \frac{4\mu - 50}{\mu}$ (4).

So if $\mu = 3$, from (1), (2), and (4), we have

$$x = -\frac{38}{3}, \quad y = -\frac{119}{6}, \quad \text{and} \quad z = -\frac{25}{3}.$$

(ii) From (3), if $\mu = 0$, then we have $25 = 0$, a contradiction. So the three planes do not intersect.


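Answer to Exercise 339 (9740 N2013/I/6). (i) Every vector in the plane can be expressed as the linear combination of any two vectors with distinct directions (see Fact 23).

(ii) By the Ratio Theorem, $\overrightarrow{ON} = \frac{4\mathbf{a} + 3\mathbf{c}}{7}$.

(iii) The area of triangle ONC is

$$\begin{aligned}
0.5\left|\overrightarrow{ON} \times \overrightarrow{OC}\right| &= 0.5\left|\frac{4\mathbf{a} + 3\mathbf{c}}{7} \times \mathbf{c}\right| = \frac{1}{14}\left|(4\mathbf{a} + 3\mathbf{c}) \times \mathbf{c}\right| \\
&= \frac{1}{14}\left|4\mathbf{a} \times \mathbf{c} + 3\mathbf{c} \times \mathbf{c}\right| && \text{(distributivity of vector product)} \\
&= \frac{1}{14}\left|4\mathbf{a} \times \mathbf{c}\right| && (\mathbf{v} \times \mathbf{v} = \mathbf{0}) \\
&= \frac{1}{14}\left|4\mathbf{a} \times (\lambda\mathbf{a} + \mu\mathbf{b})\right| \\
&= \frac{1}{14}\left|4\mathbf{a} \times \lambda\mathbf{a} + 4\mathbf{a} \times \mu\mathbf{b}\right| && \text{(distributivity of vector product)} \\
&= \frac{1}{14}\left|4\mathbf{a} \times \mu\mathbf{b}\right| = \frac{2\mu}{7}\left|\mathbf{a} \times \mathbf{b}\right|.
\end{aligned}$$

Similarly, the area of triangle OMC is

$$\begin{aligned}
0.5\left|\overrightarrow{OM} \times \overrightarrow{OC}\right| &= 0.5\left|0.5\mathbf{b} \times \mathbf{c}\right| = \frac{1}{4}\left|\mathbf{b} \times \mathbf{c}\right| = \frac{1}{4}\left|\mathbf{b} \times (\lambda\mathbf{a} + \mu\mathbf{b})\right| \\
&= \frac{1}{4}\left|\mathbf{b} \times \lambda\mathbf{a} + \mathbf{b} \times \mu\mathbf{b}\right| && \text{(distributivity of vector product)} \\
&= \frac{1}{4}\left|\mathbf{b} \times \lambda\mathbf{a}\right| && (\mathbf{v} \times \mathbf{v} = \mathbf{0}) \\
&= \frac{1}{4}\lambda\left|\mathbf{b} \times \mathbf{a}\right|.
\end{aligned}$$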

Altogether then, 2µ/7 = λ/4 or λ = 8µ/7.


Answer to Exercise 340 (9740 N2013/II/4). (i) The acute angle θ between two planes is given by the scalar product of their normal vectors:

$$\cos\theta = \frac{|(2, -2, 1) \cdot (-6, 3, 2)|}{|(2, -2, 1)|\,|(-6, 3, 2)|} = \frac{|-16|}{\sqrt{9}\sqrt{49}} = \frac{16}{3 \times 7} = \frac{16}{21}.$$

So θ ≈ 0.705.

(ii) The intersection line of two planes has direction vector given by the vector product of their normal vectors: (2, −2, 1) × (−6, 3, 2) = (−7, −10, −6).

A point that is on both planes satisfies both equations 2x − 2y + z = 1 and −6x + 3y + 2z = −1. Plugging in x = 0, the first equation yields z = 1 + 2y, which when plugged into the second equation yields y = −3/7. So a point that is on both planes is (0, −3/7, 1/7).

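(iii) The distance between a point $\mathbf{a}$ and a plane is given by $|d - \mathbf{a} \cdot \hat{\mathbf{n}}|$, where $\mathbf{n}$ is its normal vector and $d = \mathbf{r} \cdot \hat{\mathbf{n}}$.

For $p_1$, $d = 1/3$ and for $p_2$, $d = -1/7$. Hence, the distance between A(4, 3, c) and the plane $p_1$ is

$$\left|\frac{1}{3} - \frac{(4, 3, c) \cdot (2, -2, 1)}{3}\right| = \left|\frac{1}{3} - \frac{2 + c}{3}\right| = \left|-\frac{1 + c}{3}\right|,$$

and the distance between A(4, 3, c) and the plane $p_2$ is

$$\left|-\frac{1}{7} - \frac{(4, 3, c) \cdot (-6, 3, 2)}{7}\right| = \left|-\frac{1}{7} + \frac{15 - 2c}{7}\right| = \left|\frac{14 - 2c}{7}\right|.$$

Equating these two distances, we have

$$-\frac{1 + c}{3} = \frac{14 - 2c}{7} \iff -7 - 7c = 42 - 6c \iff c = -49,$$

$$\text{OR} \quad \frac{1 + c}{3} = \frac{14 - 2c}{7} \iff 7 + 7c = 42 - 6c \iff c = \frac{35}{13}.$$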


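Answer to Exercise 341 (9740 N2012/I/5). (i) The area of triangle OAC is

$$\begin{aligned}
0.5\left|\overrightarrow{OA} \times \overrightarrow{OC}\right| &= 0.5\left|\mathbf{a} \times (\lambda\mathbf{a} + \mu\mathbf{b})\right| \\
&= 0.5\left|\mathbf{a} \times \lambda\mathbf{a} + \mathbf{a} \times \mu\mathbf{b}\right| && \text{(distributivity of vector product)} \\
&= 0.5\left|\mathbf{a} \times \mu\mathbf{b}\right| && (\mathbf{v} \times \mathbf{v} = \mathbf{0}) \\
&= 0.5\mu\left|(1, -1, 1) \times (1, 2, 0)\right| \\
&= 0.5\mu\left|(-2, 1, 3)\right| = 0.5\sqrt{14}\,\mu.
\end{aligned}$$

So $0.5\sqrt{14}\,\mu = \sqrt{126}$ or $\mu = 2 \times \sqrt{126/14} = 2 \times \sqrt{9} = 2 \times 3 = 6$.

(ii) $\mathbf{c} = \lambda\mathbf{a} + \mu\mathbf{b} = \lambda\mathbf{a} + 4\mathbf{b} = \lambda(1, -1, 1) + 4(1, 2, 0) = (4 + \lambda, 8 - \lambda, \lambda)$. We are given that $|\mathbf{c}| = 5\sqrt{3}$. So $(4 + \lambda)^2 + (8 - \lambda)^2 + \lambda^2 = 3\lambda^2 - 8\lambda + 80 = (5\sqrt{3})^2 = 75 \iff 3\lambda^2 - 8\lambda + 5 = 0 = (3\lambda - 5)(\lambda - 1)$, so $\lambda = 5/3$ or 1. And $\mathbf{c} = (17/3, 19/3, 5/3)$ or $(5, 7, 1)$.

Answer to Exercise 342 (9740 N2012/I/9). (i) $\mathbf{r} = (7, 8, 9) + \lambda(8, 16, 8)$.

(ii) The position vector of N is given by $\mathbf{p} + (\overrightarrow{pa} \cdot \hat{\mathbf{v}})\hat{\mathbf{v}}$, where $\mathbf{v}$ is the direction vector of the line, p is a point on the line, and a is the given point (1, 8, 3). Compute $\widehat{(8, 16, 8)} = \frac{(1, 2, 1)}{\sqrt{6}}$ and now:

$$(7, 8, 9) + \frac{(-6, 0, -6) \cdot (1, 2, 1)}{\sqrt{6}}\,\frac{(1, 2, 1)}{\sqrt{6}} = (7, 8, 9) - \frac{12}{6}(1, 2, 1) = (5, 4, 7).$$

By the Ratio Theorem, the ratio AN ∶ NB = α ∶ 1 satisfies

$$(5, 4, 7) = \frac{(7, 8, 9) + \alpha(-1, -8, 1)}{\alpha + 1} = \frac{1}{\alpha + 1}(7 - \alpha, 8 - 8\alpha, 9 + \alpha).$$

Solving $5 = \frac{7 - \alpha}{\alpha + 1}$, we have $5\alpha + 5 = 7 - \alpha$ or $\alpha = 1/3$. So the ratio AN ∶ NB = α ∶ 1 = 1/3 ∶ 1 = 1 ∶ 3.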


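Answer to Exercise 343 (9740 N2011/I/7). (i) $\mathbf{m} = 0.5(\mathbf{p} + \mathbf{q}) = \frac{1}{6}\mathbf{a} + \frac{3}{10}\mathbf{b}$. The area of triangle OMP is

$$\begin{aligned}
0.5\left|\mathbf{m} \times \mathbf{p}\right| &= 0.5\left|0.5(\mathbf{p} + \mathbf{q}) \times \mathbf{p}\right| \\
&= 0.25\left|\mathbf{p} \times \mathbf{p} + \mathbf{q} \times \mathbf{p}\right| && \text{(distributivity of vector product)} \\
&= 0.25\left|\mathbf{q} \times \mathbf{p}\right| && (\mathbf{v} \times \mathbf{v} = \mathbf{0}) \\
&= 0.25\left|\tfrac{3}{5}\mathbf{b} \times \tfrac{1}{3}\mathbf{a}\right| = 0.05\left|\mathbf{b} \times \mathbf{a}\right| = 0.05\left|\mathbf{a} \times \mathbf{b}\right|.
\end{aligned}$$

(ii) (a) Since $\mathbf{a}$ is a unit vector, $(2p)^2 + (6p)^2 + (3p)^2 = 1$ or $49p^2 = 1$ or $p = 1/7$.

(b) $|\mathbf{a} \cdot \mathbf{b}|$ is the length of the projection vector of $\mathbf{b}$ on $\mathbf{a}$.

(c) $\mathbf{a} \times \mathbf{b} = \frac{1}{7}(2, -6, 3) \times (1, 1, -2) = \frac{1}{7}(9, 7, 8)$.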


Answer to Exercise 344 (9740 N2011/I/11). (i) A normal vector to the plane is (4 − (−2), −1 − (−5), −3 − 2) × (4 − 4, −1 − (−3), −3 − (−2)) = (6, 4, −5) × (0, 2, −1) = (6, 6, 12).

Another normal vector to the plane is a scalar multiple of the above, namely (1, 1, 2). We have $(4, -1, -3) \cdot (1, 1, 2) = -3$. Hence, a cartesian equation of p is $x + y + 2z = -3$.

(ii) From the equations for $l_1$, we have

$$\frac{x - 1}{2} = z + 3 \iff x = 2(z + 3) + 1 = 2z + 7 \quad (1) \qquad \text{and} \qquad \frac{y - 2}{-4} = z + 3 \iff y = -4z - 10. \quad (2)$$

Plug in (1) and (2) into the equations for $l_2$ to get

$$\frac{2z + 7 + 2}{1} \overset{(3)}{=} \frac{-4z - 10 - 1}{5} \overset{(4)}{=} \frac{z - 3}{k}.$$

From (3), we have $10z + 45 = -4z - 11 \iff z = -56/14 = -4$. Now from (4), we have

$$k = 5\,\frac{z - 3}{-4z - 11} = 5 \times \frac{-7}{5} = -7.$$

(iii) The direction vector of l1 is perpendicular to the normal vector of the plane p, as we can verify — (2, −4, 1) ⋅ (1, 1, 2) = 0. Moreover, a point on l1 is on p, as we can verify — (1, 2, −3) ⋅ (1, 1, 2) = −3. Altogether then, l1 is on p.

From the equations for $l_2$, we have $y = 5x + 11$ and $z = -7x - 11$. Plug these into the equation for the plane p to get: $x + (5x + 11) + 2(-7x - 11) = -3 \iff -8x - 11 = -3 \iff x = -1$. So y = 6 and z = −4. The intersection point is (−1, 6, −4).

(iv) The angle θ between $l_2$ and the normal vector to p is given by

$$\theta = \cos^{-1}\left(\frac{(1, 5, -7) \cdot (1, 1, 2)}{|(1, 5, -7)|\,|(1, 1, 2)|}\right) = \cos^{-1}\left(\frac{-8}{\sqrt{75}\sqrt{6}}\right) = \cos^{-1}\left(\frac{-8}{15\sqrt{2}}\right) \approx 1.957.$$

So the acute angle between $l_2$ and p is $1.957 - \pi/2 \approx 0.387$.


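Answer to Exercise 345 (9740 N2010/I/1). (i) $|\mathbf{b}|^2 = 1 + 2^2 + 2^2 = 9 = |\mathbf{a}|^2 = (2^2 + 3^2 + 6^2)p^2 = 49p^2$. So $p = 3/7$.

(ii)

$$\begin{aligned}
(\mathbf{a} + \mathbf{b}) \cdot (\mathbf{a} - \mathbf{b}) &= \left(\frac{6}{7} + 1, \frac{9}{7} + 2, \frac{18}{7} + 2\right) \cdot \left(\frac{6}{7} - 1, \frac{9}{7} - 2, \frac{18}{7} - 2\right) \\
&= -\frac{13}{49} - \frac{115}{49} + \frac{128}{49} = 0.
\end{aligned}$$

(Optional. Actually, more generally, since $(\mathbf{a} + \mathbf{b}) \cdot (\mathbf{a} - \mathbf{b}) = |\mathbf{a}|^2 - |\mathbf{b}|^2$, if $|\mathbf{a}| = |\mathbf{b}|$, then $(\mathbf{a} + \mathbf{b}) \cdot (\mathbf{a} - \mathbf{b}) = 0$.)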

Answer to Exercise 346 (9740 N2010/I/10). (i) The line has direction vector (−3, 6, 9), which is a scalar multiple of the plane's normal vector (1, −2, −3). So the line is perpendicular to the plane.

(ii) From the equations of the line, we have $y = -2x + 19$ and $z = -3x + 27$. Plug these in to the equation of the plane to get $x - 2(-2x + 19) - 3(-3x + 27) = 0 \iff 14x - 119 = 0 \iff x = 119/14 = 8.5$. And so y = 2 and z = 1.5. So the point of intersection is (8.5, 2, 1.5).

(iii) We can easily verify that the given point satisfies the equations for the line:

$$\frac{-2 - 10}{-3} = 4 = \frac{23 + 1}{6} = \frac{33 + 3}{9}.$$

The point is therefore on the line.

The point of intersection we found in (ii) (call it X) is equidistant to both A and B. Moreover, these three points are collinear. Thus, B = (19, −19, −30).

(iv) The area of triangle OAB is $0.5\left|\mathbf{a} \times \mathbf{b}\right| = 0.5\left|(-2, 23, 33) \times (19, -19, -30)\right| = 0.5\left|(-63, 567, -399)\right| \approx 348$.


Answer to Exercise 347 (9740 N2009/I/10). (i) The angle θ between the two planes is given by

$$\theta = \cos^{-1}\left(\frac{(2, 1, 3) \cdot (-1, 2, 1)}{|(2, 1, 3)|\,|(-1, 2, 1)|}\right) = \cos^{-1}\left(\frac{3}{\sqrt{14}\sqrt{6}}\right) \approx 1.237.$$

(ii) The line l has direction vector (2, 1, 3) × (−1, 2, 1) = (−5, −5, 5) and thus also direction vector (1, 1, −1).

A point (x, y, z) that lies on both planes satisfies $2x + y + 3z = 1$ (1) and $-x + 2y + z = 2$ (2). Plugging in x = 0, (1) yields $y = 1 - 3z$ and now (2) yields z = 0. So (x, y, z) = (0, 1, 0).

Altogether then, the line l has vector equation r = (0, 1, 0) + λ(1, 1, −1), for λ ∈ R.

(iii) The line l is parallel to the plane p3 , as we now verify: (1, 1, −1) ⋅ (2 − k, 1 + 2k, 3 + k) = 2 − k + 1 + 2k − 3 − k = 0. Moreover, the point (0, 1, 0), which is on the line l, is also on the plane p3 , as we now verify: 2 × 0 + 1 + 3 × 0 − 1 + k(−0 + 2 × 0 + 0 − 2) = 0. Altogether then, the line l lies in p3 for any constant k.

We want to find k such that (2, 3, 4) satisfies $2x + y + 3z - 1 + k(-x + 2y + z - 2) = 0$. That is, $2 \times 2 + 3 + 3 \times 4 - 1 + k(-2 + 2 \times 3 + 4 - 2) = 18 + 6k = 0$. So k = −3. So the plane is $2x + y + 3z - 1 - 3(-x + 2y + z - 2) = 0$ or $5x - 5y + 5 = 0$ or $x - y + 1 = 0$.

Answer to Exercise 348 (9740 N2009/II/2). (i) Let $\mathbf{p} = \overrightarrow{OP}$. By the Ratio Theorem, $\mathbf{p} = \frac{2(11, -13, 2) + (14, 14, 14)}{3} = (12, -4, 6)$. So the point P is (12, −4, 6).

(ii) $\overrightarrow{AB} \cdot \mathbf{p} = (-3, -27, -12) \cdot (12, -4, 6) = 0$.

(iii) $\mathbf{c} = \widehat{(12, -4, 6)} = \widehat{(6, -2, 3)} = \frac{(6, -2, 3)}{\sqrt{6^2 + (-2)^2 + 3^2}} = \frac{(6, -2, 3)}{7}$. $|\mathbf{a} \cdot \mathbf{c}|$ is the length of the projection vector of $\mathbf{a}$ on $\mathbf{p}$.

(iv) $\mathbf{a} \times \mathbf{p} = (140, 84, -224)$. $|\mathbf{a} \times \mathbf{p}|$ is the area of the parallelogram formed with $\mathbf{a}$ and $\mathbf{p}$ as its sides, where the heads of $\mathbf{a}$ and $\mathbf{p}$ are the same point. The area of the triangle OAP is $0.5\left|\mathbf{a} \times \mathbf{p}\right| = 0.5\sqrt{140^2 + 84^2 + (-224)^2} \approx 139$.

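Answer to Exercise 349 (9740 N2008/I/3). (i) $\overrightarrow{OP} = \overrightarrow{OA} + \overrightarrow{OB} = (6, 3, -3)$.

(ii) The angle AOB is equal to the angle between the vectors $\overrightarrow{OA}$ and $\overrightarrow{OB}$:

$$\cos^{-1}\left(\frac{(1, 4, -3) \cdot (5, -1, 0)}{|(1, 4, -3)|\,|(5, -1, 0)|}\right) = \cos^{-1}\left(\frac{1}{\sqrt{26}\sqrt{26}}\right) \approx 1.532.$$

(iii) It is $\left|\overrightarrow{OA} \times \overrightarrow{OB}\right| \approx 25.981$.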

Answer to Exercise 350 (9740 N2008/I/11). You can either find the intersection point using a graphing calculator or painfully by hand, as I do now:

From $p_1$, $z = 1 - \frac{2}{3}x + \frac{5}{3}y$ (1). Plug (1) into the equation for $p_2$ to get $3x + 2y - 5z = 3x + 2y - 5\left(1 - \frac{2}{3}x + \frac{5}{3}y\right) = -5$ or $\frac{19}{3}x - \frac{19}{3}y = 0$ or $x = y$ (2). And so from (1), we have $z = 1 + x$ (3). Now plug in (2) and (3) into the equation for $p_3$ to get $5x + \lambda x + 17(1 + x) = \mu$ or $(22 + \lambda)x = \mu - 17$ or

$$x = \frac{\mu - 17}{22 + \lambda} = -\frac{0.4}{1.1} = -\frac{4}{11}.$$

So the point of intersection is $\left(-\frac{4}{11}, -\frac{4}{11}, \frac{7}{11}\right)$.

(i) The line has direction vector $(2, -5, 3) \times (3, 2, -5) = (19, 19, 19)$ and thus also direction vector (1, 1, 1). From our work above, x = y at the intersection of the two planes. Plug in x = 0 to find that the two planes intersect at (0, 0, 1). Altogether then, the line has vector equation $\mathbf{r} = (0, 0, 1) + \alpha(1, 1, 1)$, for $\alpha \in \mathbb{R}$.

(ii) Two points on the line are (0, 0, 1) and (−1, −1, 0). Plug these into the equation for plane $p_3$ to get $17 = \mu$ and $-5 - \lambda = \mu$, so that $\lambda = -22$.

(iii) The line l must be parallel to the plane $p_3$, so that $(1, 1, 1) \cdot (5, \lambda, 17) = 0$ or $\lambda = -22$. Moreover, the point (0, 0, 1) on the line is not on the plane, so that $\mu \neq 17$.

(iv) Another vector that is parallel to the plane to be found is $(1, -1, 3) - (0, 0, 1) = (1, -1, 2)$. The plane thus has normal vector $(1, 1, 1) \times (1, -1, 2) = (3, -1, -2)$. Compute also $d = (0, 0, 1) \cdot (3, -1, -2) = -2$. Altogether then, the plane has cartesian equation $3x - y - 2z = -2$.

Answer to Exercise 351 (9233 N2008/I/11). (i) From the equations of the first line, we have $y = 2x - 2$ and $z = 5 - x$. Plugging these into the equations of the second line, we have

$$\frac{x - 1}{-1} \overset{(1)}{=} \frac{2x - 2 + 3}{-3} \overset{(2)}{=} \frac{5 - x - 4}{1}.$$

Both (1) and (2) imply that x = 4 and so indeed the two lines intersect. (If they didn't intersect, then (1) would contradict (2).) So the point of intersection is (4, 6, 1).

(ii) The angle between the lines is given by

$$\cos^{-1}\left(\frac{(1, 2, -1) \cdot (-1, -3, 1)}{|(1, 2, -1)|\,|(-1, -3, 1)|}\right) = \cos^{-1}\left(\frac{-8}{\sqrt{6}\sqrt{11}}\right) \approx 2.967.$$

This is obtuse. So the acute angle is $\pi - 2.967 \approx 0.175$.

Answer to Exercise 352 (9740 N2007/I/6). (i) $(1, -1, 2) \cdot (2, 4, 1) = 0$.

(ii) By the Ratio Theorem, $\overrightarrow{OM} = \frac{1}{3}\left[2(1, -1, 2) + (2, 4, 1)\right] = \frac{1}{3}(4, 2, 5)$.

(iii) The area of triangle OAC is

$$0.5\left|\overrightarrow{OA} \times \overrightarrow{OC}\right| = 0.5\left|(1, -1, 2) \times (-4, 2, 2)\right| = 0.5\left|(-6, -10, -2)\right| = 0.5\sqrt{140} \approx 5.916.$$


Answer to Exercise 353 (9740 N2007/I/8). (i) The line l has vector equation $\mathbf{r} = (1, 2, 4) + \lambda(-3, 1, -3)$, for $\lambda \in \mathbb{R}$. Plugging this into the equation for the plane, we have $3(1 - 3\lambda) - (2 + \lambda) + 2(4 - 3\lambda) = 17 \iff 9 - 16\lambda = 17 \iff \lambda = -0.5$. So the point of intersection is $(1, 2, 4) - 0.5(-3, 1, -3) = (2.5, 1.5, 5.5)$.

(ii) The angle between l and the normal vector to p is

$$\cos^{-1}\left(\frac{(-3, 1, -3) \cdot (3, -1, 2)}{|(-3, 1, -3)|\,|(3, -1, 2)|}\right) = \cos^{-1}\left(\frac{-16}{\sqrt{19}\sqrt{14}}\right) \approx 2.946.$$

So the angle between the line and the plane is $2.946 - \pi/2 \approx 1.376$.

(iii) $\left|d - \mathbf{a} \cdot \hat{\mathbf{n}}\right| = \frac{|17 - (1, 2, 4) \cdot (3, -1, 2)|}{\sqrt{14}} = \frac{|17 - 9|}{\sqrt{14}} = \frac{8}{\sqrt{14}} = \frac{4\sqrt{14}}{7} \approx 2.138$.

Answer to Exercise 354 (9233 N2007/I/7). The foot of the perpendicular from a point A to a line is $Q + (\overrightarrow{QA} \cdot \hat{\mathbf{v}})\hat{\mathbf{v}}$, where Q is any point on the line and $\mathbf{v}$ is the line's direction vector. Hence,

$$P = (-3, 8, 3) + \frac{(4, -5, -5) \cdot (2, 2, 3)}{\sqrt{17}}\,\frac{(2, 2, 3)}{\sqrt{17}} = (-3, 8, 3) - \frac{17(2, 2, 3)}{17} = (-5, 6, 0).$$

$\left|\overrightarrow{AP}\right| = |(-6, 3, 2)| = \sqrt{49} = 7$.


Answer to Exercise 355 (9233 N2007/II/2). (i) $\overrightarrow{OD} = 0.75(1, -3, 4)$. So the line AD has direction vector (3.25, 3.25, 0) and hence also direction vector (1, 1, 0). So the line AD has equation $\mathbf{r} = (4, 1, 3) + \lambda(1, 1, 0)$, for $\lambda \in \mathbb{R}$.

(ii) $\overrightarrow{OC} = 0.25(4, 1, 3)$. So the line BC has direction vector (0, 3.25, −3.25) and hence also direction vector (0, 1, −1). So the line BC has equation $\mathbf{r} = (1, -3, 4) + \mu(0, 1, -1)$, for $\mu \in \mathbb{R}$.

Setting the equations of the two lines equal to each other, we have $4 + \lambda = 1$, $1 + \lambda = -3 + \mu$, and $3 = 4 - \mu$, so that $\lambda = -3$ and $\mu = 1$. And the point of intersection is (1, −2, 3).

Answer to Exercise 356 (9233 N2006/I/14). By the Ratio Theorem, $\overrightarrow{OP} = (1 - \lambda)\overrightarrow{OA} + \lambda\overrightarrow{OB} = (1 - \lambda)(1, -2, 5) + \lambda(1, 3, 0) = (1, -2 + 5\lambda, 5 - 5\lambda)$. And $\overrightarrow{OQ} = (1 - \mu)\overrightarrow{OC} + \mu\overrightarrow{OD} = (1 - \mu)(10, 1, 2) + \mu(-2, 4, 5) = (10 - 12\mu, 1 + 3\mu, 2 + 3\mu)$.

(i) $\overrightarrow{PQ}$ has direction vector $\overrightarrow{AB} \times \overrightarrow{CD} = (0, 5, -5) \times (-12, 3, 3) = (30, 60, 60)$ and hence also direction vector (1, 2, 2).

Moreover, $\overrightarrow{PQ} = \overrightarrow{OQ} - \overrightarrow{OP} = (9 - 12\mu, 3 + 3\mu - 5\lambda, -3 + 3\mu + 5\lambda)$, which must be a scalar multiple of (1, 2, 2). And so $3 + 3\mu - 5\lambda = 2(9 - 12\mu)$ (1) and $-3 + 3\mu + 5\lambda = 2(9 - 12\mu)$ (2). Taking (2) minus (1), we have $-6 + 10\lambda = 0$ or $\lambda = 0.6$. Taking (2) plus (1), we have $6\mu = 4(9 - 12\mu)$ or $\mu = 2/3$. Altogether then, $\overrightarrow{PQ} = (1, 2, 2)$, as desired.

(ii) First observe that $\overrightarrow{AQ} = \overrightarrow{OQ} - \overrightarrow{OA} = (10 - 12\mu, 1 + 3\mu, 2 + 3\mu) - (1, -2, 5) = (2, 3, 4) - (1, -2, 5) = (1, 5, -1)$. Now compute that the area of triangle ABQ is

$$0.5\left|\overrightarrow{AB} \times \overrightarrow{AQ}\right| = 0.5\left|(0, 5, -5) \times (1, 5, -1)\right| = 0.5\left|(20, -5, -5)\right| \approx 10.607.$$


92.4 Answers for Ch. 77: Complex Numbers

Answer to Exercise 357 (9740 N2015/I/9). (a)

$$\begin{aligned}
\frac{w^2}{w^*} &= \frac{(a + ib)^2}{a - ib} = \frac{a^2 - b^2 + 2abi}{a - ib} = \frac{a^2 - b^2 + 2abi}{a - ib} \times \frac{a + ib}{a + ib} \\
&= \frac{a^3 - ab^2 + 2a^2 bi + a^2 ib - ib^3 - 2ab^2}{a^2 + b^2} \\
&= \frac{1}{a^2 + b^2}\left[\left(a^3 - 3ab^2\right) + i\left(3a^2 b - b^3\right)\right]
\end{aligned}$$

is purely imaginary if and only if $a^3 - 3ab^2 = 0$. But $a^3 - 3ab^2 = a(a^2 - 3b^2) = a(a - \sqrt{3}b)(a + \sqrt{3}b)$. So either $b = \pm a/\sqrt{3}$ or a = 0 (but the latter is explicitly ruled out in the question).

Altogether, the possible values of $w = a + ib$ are given by $b = \pm a/\sqrt{3}$ and a is any non-zero real number.

(b) (i) $z^5 = 2^5 e^{i\pi(-0.5)} = 2^5 e^{i\pi(-0.5 + 2k)}$ for $k \in \mathbb{Z}$. So $z = 2e^{i\pi(-0.5 + 2k)/5}$ for $k = 0, \pm 1, \pm 2$. So $|z| = 2$ and $\arg z = -0.9\pi, -0.5\pi, -0.1\pi, 0.3\pi, 0.7\pi$.

(ii)

$$\begin{aligned}
z_1 - z_2 &= 2e^{i\pi(0.7)} - 2e^{i\pi(-0.9)} \\
&= 2e^{i\pi(-0.1)}\left(e^{i\pi(0.8)} - e^{i\pi(-0.8)}\right) \\
&= 2e^{i\pi(-0.1)}\,2i \sin 0.8\pi && \text{(Fact 53)} \\
&= (4 \sin 0.8\pi)\,ie^{i\pi(-0.1)}.
\end{aligned}$$

So

$$\begin{aligned}
\arg(z_1 - z_2) &= \arg\left[(4 \sin 0.8\pi)\,ie^{i\pi(-0.1)}\right] \\
&= \arg(4 \sin 0.8\pi) + \arg i + \arg\left(e^{i\pi(-0.1)}\right) + 2k\pi \\
&= 0 + \pi/2 - 0.1\pi + 2k\pi = 0.4\pi \quad (k = 0),
\end{aligned}$$

and

$$\left|z_1 - z_2\right| = \left|(4 \sin 0.8\pi)\,ie^{i\pi(-0.1)}\right| = |4 \sin 0.8\pi|\,|i|\,\left|e^{i\pi(-0.1)}\right| = 4 \sin 0.8\pi = 4 \sin 0.2\pi,$$

where $|i| = 1$ and $\left|e^{i\pi(-0.1)}\right| = 1$, and the last step uses the fact that $\sin(\pi - x) = \sin x$.

Answer to Exercise 358 (9740 N2014/I/5). (i) z 2 = (1 + 2i)2 = 12 + (2i)2 + 2(1)(2i) = 1 − 4 + 4i = −3 + 4i.

z 3 = (−3 + 4i)(1 + 2i) = −3 − 6i + 4i + (4i)(2i) = −3 − 2i − 8 = −11 − 2i. So

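$$\frac{1}{z^3} = \frac{1}{-11 - 2i} = \frac{1}{-11 - 2i} \times \frac{-11 + 2i}{-11 + 2i} = \frac{-11 + 2i}{11^2 - (2i)^2} = \frac{-11 + 2i}{121 + 4} = \frac{-11 + 2i}{125}.$$

(ii) Since $pz^2 + \frac{q}{z^3} = p(-3 + 4i) + q\,\frac{-11 + 2i}{125} = \left(-3p - \frac{11}{125}q\right) + i\left(4p + \frac{2}{125}q\right)$ is real, we have $4p + \frac{2}{125}q = 0$ or $q = -250p$. And $pz^2 + \frac{q}{z^3} = 19p$.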


Answer to Exercise 359 (9740 N2014/II/4). (a) (i) This is simply a circle with radius 4 centred on the point −5 + i. It has cartesian equation $(x + 5)^2 + (y - 1)^2 = 4^2$ (1).

[Figure: Argand diagram of the circle {z : |z + 5 − i| = 4}, radius 4, centre (−5, 1).]

(ii) The complex equation $|z - 6i| = |z + 10 + 4i|$ is equivalent to the cartesian equation $(x - 0)^2 + (y - 6)^2 = (x + 10)^2 + (y + 4)^2$ or $-12y + 36 = 20x + 100 + 8y + 16$ or $y = -x - 4$ (2).

So to find the intersection points of the line and the circle, plug (2) into (1) to get $(x + 5)^2 + (-x - 4 - 1)^2 = 4^2$ or $2(x + 5)^2 = 4^2$ or $(x + 5)^2 = 8$ or $x + 5 = \pm\sqrt{8}$ or $x = -5 \pm \sqrt{8}$. So the possible values of z are $-5 \pm \sqrt{8} + (5 \mp \sqrt{8} - 4)i = -5 \pm \sqrt{8} + (1 \mp \sqrt{8})i$.

(b) (i) $w = \sqrt{3} - i$, so $|w| = \sqrt{(\sqrt{3})^2 + (-1)^2} = 2$ and $\arg w = \tan^{-1}\frac{-1}{\sqrt{3}} = -\frac{\pi}{6}$. So $w = 2e^{i(-\pi/6)}$. And so $w^6 = 2^6 e^{i(-\pi + 2k\pi)} = 2^6 e^{i\pi}$.

(ii) $\arg\left(\frac{w^n}{w^*}\right) = \arg w^n - \arg w^* + 2k\pi = n \arg w + \arg w + 2k\pi = (n + 1)\arg w + 2k\pi = (n + 1) \times (-\pi/6) + 2k\pi$. A complex number z is real if and only if $\arg z = 0$ or $\arg z = \pi$. So by observation, the three smallest positive whole number values of n for which $\frac{w^n}{w^*}$ is real are 5, 11, and 17.

Answer to Exercise 360 (9740 N2013/I/4). (i) $(1 + 2i)^3 = 1 + 3 \times 2i + 3 \times (2i)^2 + (2i)^3 = 1 + 6i - 12 - 8i = -11 - 2i$.

(ii) Since $w = 1 + 2i$ is a root for $az^3 + 5z^2 + 17z + b = 0$, we have

$$\begin{aligned}
0 &= a(1 + 2i)^3 + 5(1 + 2i)^2 + 17(1 + 2i) + b \\
&= a(-11 - 2i) + 5(-3 + 4i) + 17 + 34i + b \\
&= (-11a - 15 + 17 + b) + i(-2a + 20 + 34) \\
&= \underbrace{(2 - 11a + b)}_{0} + i\underbrace{(54 - 2a)}_{0}.
\end{aligned}$$

$54 - 2a = 0 \implies a = 27$. And $2 - 11a + b = 0 \implies b = 11(27) - 2 = 295$.

(iii) By the complex conjugate roots theorem, 1 − 2i is also a root for the equation. Write

$$\begin{aligned}
27z^3 + 5z^2 + 17z + 295 &= 27\left[z - (1 + 2i)\right]\left[z - (1 - 2i)\right](z - k) \\
&= 27\left[(z - 1)^2 - (2i)^2\right](z - k) \\
&= 27\left(z^2 - 2z + 5\right)(z - k) \\
&= 27\left[z^3 - (k + 2)z^2 + (2k + 5)z - 5k\right].
\end{aligned}$$

So $295 = 27 \times (-5k)$ or $k = -59/27$. The roots are $1 \pm 2i$ and $-59/27$.
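The values a = 27 and b = 295 and the three roots can be checked by direct substitution using Python's built-in complex numbers. This is a sketch; the function name `cubic` is chosen here for illustration.

```python
def cubic(z, a=27, b=295):
    """Evaluate a*z^3 + 5*z^2 + 17*z + b at a (possibly complex) z."""
    return a * z ** 3 + 5 * z ** 2 + 17 * z + b


# The two complex conjugate roots and the real root found in part (iii).
ROOTS = [1 + 2j, 1 - 2j, -59 / 27]

if __name__ == "__main__":
    for root in ROOTS:
        # Each value should be zero up to floating-point rounding.
        print(root, abs(cubic(root)))
```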


Answer to Exercise 361 (9740 N2013/I/8). (i) $|w| = \left|(1 - i\sqrt{3})z\right| = \left|1 - i\sqrt{3}\right||z| = \sqrt{1^2 + (\sqrt{3})^2}\,|z| = 2|z|$. $\arg w = \arg\left[(1 - i\sqrt{3})z\right] = \arg(1 - i\sqrt{3}) + \arg z + 2k\pi = \tan^{-1}(-\sqrt{3}) + \theta + 2k\pi = -\pi/3 + \theta + 2k\pi \in [-\pi/3 + 2k\pi, \pi/6 + 2k\pi]$. So we should choose k = 0 and $\arg w = -\pi/3 + \theta$.

(ii) z is the top-right quarter of the circumference of the circle of radius r, centred on the origin. Take the position vector of z, rotate it clockwise by π/3 radians about the origin, double its length. This is the position vector of w.

[Figure: Argand diagram showing the quarter-circle {z = re^{iθ} : θ ∈ [0, π/2]} from re^{i0} to re^{iπ/2}, and its image under w, from 2re^{i(−π/3)} to 2re^{iπ/6}.]

(iii) $\arg\left(\frac{z^{10}}{w^2}\right) = \arg z^{10} - \arg w^2 + 2k\pi = 10 \arg z - 2 \arg w + 2k\pi = 10\theta - 2\left(-\frac{\pi}{3} + \theta\right) + 2k\pi = 8\theta + \frac{2\pi}{3} + 2k\pi = \pi$, so $\theta = \frac{\pi}{24}$ (with k = 0).


Answer to Exercise 362 (9740 N2012/I/6). (i) $z^3 = (1 + ic)^3 = 1^3 + 3(ic) + 3(ic)^2 + (ic)^3 = 1 + 3ic - 3c^2 - ic^3 = (1 - 3c^2) + i(3c - c^3)$.

(ii) $z^3$ is real if and only if $3c - c^3 = 0$ or $c = 0, \pm\sqrt{3}$. The question already ruled out c = 0. So $c = \pm\sqrt{3}$ and $z = 1 \pm i\sqrt{3}$.

(iii) $z = 1 - i\sqrt{3} = |z|e^{i \arg z} = 2e^{i(-\pi/3)}$. $|z^n| = 2^n > 1000$ if and only if n > 9. (The reason is that $2^9 = 512$ and $2^{10} = 1024$.) So the smallest positive integer n is 10. $|z^{10}| = 2^{10}$ and $\arg z^{10} = 10(-\pi/3) + 2k\pi = 2\pi/3$ (k = 2).


Answer to Exercise 363 (9740 N2012/II/2). (i) $|z - (7 - 3i)| = 4$ describes a circle with centre 7 − 3i and radius 4.

[Figure: Argand diagram of the circle {z : |z − (7 − 3i)| = 4}, radius 4, centre c = (7, −3), with the origin o and the points a, b, d marked.]

(ii) (a) a is the point on the circle's circumference that is closest to the origin o. The line l through the origin and the centre of the circle passes through a (see Fact 56).

But the distance of the centre of the circle from the origin is $\sqrt{7^2 + 3^2} = \sqrt{58}$. The distance of the centre of the circle to the point a is 4 (this is simply the length of the radius). Hence, the distance of the origin to the point a is $\sqrt{58} - 4$.

(b) △abc is right. So $ab^2 + bc^2 = ca^2 = 4^2 = 16$.

But the line l has slope $-\frac{3}{7}$ (because it runs through the origin and the point 7 − 3i) and so $ab = \frac{3}{7}bc$. Hence, $\left(\frac{3}{7}\right)^2 \times bc^2 + bc^2 = 16$. Or $bc^2 = 16 \times \frac{49}{58}$. Or $bc = 4 \times \frac{7}{\sqrt{58}} = \frac{28}{\sqrt{58}}$. And $ab = \frac{12}{\sqrt{58}}$. Hence, $a = \left(7 - \frac{28}{\sqrt{58}}, -3 + \frac{12}{\sqrt{58}}\right)$.

(iii) By observation, d is the point where $|\arg z|$ is as large as possible. There, $\arg z = \arg(7 - 3i) - \angle cod$.

But △cod is right. So $\angle cod = \sin^{-1}\frac{4}{\sqrt{58}}$. Moreover, $\arg(7 - 3i) = \tan^{-1}\frac{-3}{7}$.

Altogether then, $\arg z = \tan^{-1}\frac{-3}{7} - \sin^{-1}\frac{4}{\sqrt{58}} \approx -0.9579$.

Answer to Exercise 364 (9740 N2011/I/10). (i) Let (x + iy)² = x² − y² + i(2xy) = −8i. So x² − y² = 0 (equation 1) and 2xy = −8 (equation 2). From equation 2, we observe that x and y must have opposite signs. From equation 1, x = ±y and, by the observation of the previous sentence, we must have x = −y. And now from equation 2, we have 2(−y) × y = −8 or −2y² = −8 or y = ±2. Altogether then, z1 = −2 + 2i and z2 = 2 − 2i.

(ii) Using the quadratic formula and part (i), w = [−4 ± √(4² − 4(1)(4 + 2i))]/2 = −2 ± √(4 − (4 + 2i)) = −2 ± √(−2i) = −2 ± (1 − i) = −1 − i, −3 + i. (Here √(−8i) = ±(2 − 2i) by part (i), so √(−2i) = ±(1 − i).)

(iii) (a) This is simply the line that is equidistant to z1 = (−2, 2) and z2 = (2, −2). By observation, it has cartesian equation y = x.

[Figure: the two lines |z − z1| = |z − z2| and |w − w1| = |w − w2| on the same axes.]

(b) This is simply the line that is equidistant from w1 = −1 − i = (−1, −1) and w2 = −3 + i = (−3, 1), the two roots of w² + 4w + (4 + 2i) = 0. By observation, it has cartesian equation y = x + 2.

(iv) The two lines are parallel and do not intersect.

Answer to Exercise 365 (9740 N2011/II/1). (i) This is simply the circle with radius 3 and centre 2 + 5i, including all the points within the circle.

[Figure: the disc {z : |z − (2 + 5i)| ≤ 3} with centre c = (2, 5) and radius 3, the points a and b on its circumference, the points P1 and P2, and the point (6, 1).]

(ii) The points on the circle’s circumference that are closest to and furthest from the origin o are a and b. The line l through the origin and the centre of the circle passes through both a and b (see Fact 56). oc = √(2² + 5²) = √29 and ac = 3. Hence, oa = √29 − 3. Symmetrically, ob = √29 + 3. The minimum and maximum possible values of |z| are thus √29 − 3 and √29 + 3.

(iii) The locus of points that satisfy both |z − 2 − 5i| ≤ 3 and 0 ≤ arg z ≤ π/4 is the blue closed segment. By observation, |z − 6 − i| is maximised either at P1 or P2. These points lie on both the circle and the line y = x, so writing each as (p, p), they are given by (p − 2)² + (p − 5)² = 3² ⇐⇒ 2p² − 14p + 20 = 0 ⇐⇒ p² − 7p + 10 = 0 ⇐⇒ (p − 5)(p − 2) = 0.

So P1 = (2, 2) and P2 = (5, 5). The distances of these points to the point (6, 1) are √(4² + 1²) = √17 and √(1² + 4²) = √17. So both are, equally, the furthest points from (6, 1).

Answer to Exercise 366 (9740 N2010/I/8). (i) For z1, r = √(1² + (√3)²) = 2 and θ = tan⁻¹(√3/1) = π/3. For z2, r = √((−1)² + (−1)²) = √2 and, since z2 lies in the third quadrant, θ = tan⁻¹(−1/−1) − π = −3π/4. Altogether then, z1 = 2[cos(π/3) + i sin(π/3)] and z2 = √2[cos(−3π/4) + i sin(−3π/4)].

(ii) |z1/z2| = |z1|/|z2| = 2/√2 = √2. arg(z1/z2) = arg z1 − arg z2 = π/3 − (−3π/4) = 13π/12, which corresponds to the principal value −11π/12. Hence, (z1/z2)* = √2[cos(11π/12) + i sin(11π/12)].

(iii) (a) This is simply the circle with centre z1 and radius 2.

[Figure: the circle |z − z1| = 2 with centre z1 = (1, √3), and the ray arg(z − z2) = π/4 from z2 = (−1, −1).]

(b) This is simply the ray from the point z2 (but excluding the point z2) that makes an angle π/4 with the horizontal.

(iv) We want to find x > 0 such that |(x, 0) − (1, √3)| = 2 or (x − 1)² + 3 = 4 or (x − 1)² = 1 or x = 0, 2. So (2, 0) is where the locus |z − z1| = 2 meets the positive real axis.

Answer to Exercise 367 (9740 N2010/II/1). (i) x = [6 ± √((−6)² − 4(1)(34))]/2 = 3 ± √(−100)/2 = 3 ± 5i.

(ii) Since −2 + i is a root of x⁴ + 4x³ + x² + ax + b = 0, we have

(−2 + i)⁴ + 4(−2 + i)³ + (−2 + i)² + a(−2 + i) + b = 0
⋮ (tedious algebra)
(−12 − 2a + b) + (16 + a)i = 0.

Both the real part −12 − 2a + b and the imaginary part 16 + a must equal 0. 16 + a = 0 ⟹ a = −16. Moreover, −12 − 2a + b = 0 ⟹ −12 − 2(−16) + b = 0 ⟹ b = −20. By the complex conjugate roots theorem, −2 − i is also a root. So

x⁴ + 4x³ + x² − 16x − 20 = [x − (−2 + i)][x − (−2 − i)](x² + cx + d)
= (x + 2 − i)(x + 2 + i)(x² + cx + d)
= [(x + 2)² − i²](x² + cx + d)
= (x² + 4x + 5)(x² + cx + d)
= x⁴ + (4 + c)x³ + (5 + 4c + d)x² + (5c + 4d)x + 5d.

Comparing coefficients, we have c = 0 and d = −4. So x2 + cx + d = x2 − 4 = (x − 2)(x + 2). So the other two roots are ±2.
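As a quick sanity check, the four roots can be substituted back into the quartic. This is a minimal sketch in plain Python; the helper name `p` is hypothetical and only used for this illustration.

```python
def p(x):
    """The quartic from part (ii) with a = -16 and b = -20."""
    return x**4 + 4 * x**3 + x**2 - 16 * x - 20

# The two complex roots and the two real roots found above.
roots = [complex(-2, 1), complex(-2, -1), 2, -2]

# Each residual |p(root)| should be zero (up to rounding).
residuals = [abs(p(r)) for r in roots]
print(residuals)
```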


Answer to Exercise 368 (9740 N2009/I/9). (i) z⁷ = 1 + i = 2^{1/2} e^{iπ/4} = 2^{1/2} e^{iπ(1/4 + 2k)}. By de Moivre’s Theorem, z = 2^{1/14} e^{iπ(1/28 + 2k/7)}, for k = 0, ±1, ±2, ±3.

(ii) [Figure: the seven roots plotted on an Argand diagram.]

(iii) |z − z1| = |z − z2| is the line (blue) that is equidistant from the points z1 = 2^{1/14} e^{iπ(1/28)} and z2 = 2^{1/14} e^{iπ(1/28 + 2/7)}.

Explanation #1: 0 satisfies the equation |z − z1| = |z − z2|, as we can easily verify: |0 − z1| = |0 − z2| = 2^{1/14}. So 0 is in the locus |z − z1| = |z − z2|.

Explanation #2: The perpendicular bisector of a chord runs through the centre of the circle. So in this case, the perpendicular bisector of the chord z1 z2 runs through the origin (which is the centre of the circle).


Answer to Exercise 369 (9740 N2008/I/8). (i) (1 + √3i)²(1 + √3i) = (−2 + 2√3i)(1 + √3i) = −2 − 6 + (2√3 − 2√3)i = −8.

(ii) 0 = 2z³ + az² + bz + 4
= 2(−8) + a(−2 + 2√3i) + b(1 + √3i) + 4
= (−12 − 2a + b) + i√3(2a + b).

So −12 − 2a + b = 0 (equation 1) and 2a + b = 0 (equation 2). Adding equations 1 and 2 together, we have −12 + 2b = 0 or b = 6. And now from equation 2, a = −3.

(iii) By the complex conjugate roots theorem, another root is 1 − √3i. So

2z³ − 3z² + 6z + 4 = 2[z − (1 + √3i)][z − (1 − √3i)](z − c)
= 2(z − 1 − √3i)(z − 1 + √3i)(z − c)
= 2[(z − 1)² − (√3i)²](z − c)
= 2(z² − 2z + 4)(z − c)
= 2[z³ + (−c − 2)z² + (4 + 2c)z − 4c].

Comparing coefficients, we have c = −0.5, which is also the third root of the equation.

[Figure: the three roots plotted on an Argand diagram.]

Answer to Exercise 370 (9740 N2008/II/3). (a) |p| = |w/w*| = |w|/|w*| = 1 and arg p = arg(w/w*) = arg w − arg w* = θ − (−θ) = 2θ.

arg p⁵ = 10θ + 2kπ. The argument of a positive real number is 2mπ for some integer m. Hence, θ = nπ/5 for integers n. Given also the restriction that θ ∈ (0, π/2), we have θ = π/5 or 2π/5.

(b) |z| ≤ 6 is a circle of radius 6 centred on the origin, including the interior of the circle.

∣z∣ = ∣z − 8 − 6i∣ is a line that is equidistant to the origin and the point (8, 6).

So the locus of z is the line segment AB.

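[Figure: the disc |z| ≤ 6 centred on the origin O, the line |z| = |z − 8 − 6i|, the point 8 + 6i, the midpoint C of the segment from O to 8 + 6i, and the endpoints A and B of the chord.]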

(i) See the sketch. (ii) Observe that arg z is maximised and minimised at A and B. arg A = ∠COX + ∠AOC, arg B = ∠COX − ∠BOC. Moreover, ∠COX = arg(8 + 6i) = tan⁻¹(6/8) = tan⁻¹(3/4).

Note that △AOC is right-angled and the length of OC is half of |8 + 6i| = √(8² + 6²) = 10. So OC = 5. Also OA = OB = 6 (the radius). Thus, ∠AOC = ∠BOC = cos⁻¹(OC/OA) = cos⁻¹(OC/OB) = cos⁻¹(5/6).

Altogether then, arg A = ∠COX + ∠AOC = tan⁻¹(3/4) + cos⁻¹(5/6) ≈ 1.229 and arg B = ∠COX − ∠BOC = tan⁻¹(3/4) − cos⁻¹(5/6) ≈ 0.058.

Answer to Exercise 371 (9233 N2008/I/9). (i) This is the circle centred on −2i with radius 2.

[Figure: the circle |z + 2i| = 2 with centre −2i and radius 2.]

(ii) This is the line that is equidistant from the points 2 + i and i.

[Figure: sketch of this line.]

Answer to Exercise 371 (9233 N2008/I/9) (iii). This is the region bounded by and including the rays arg(z + 1 − 3i) = π/6 and arg(z + 1 − 3i) = π/3.

[Figure: the region π/6 ≤ arg(z + 1 − 3i) ≤ π/3, bounded by two rays from the point −1 + 3i that make angles π/6 and π/3 with the horizontal.]

Answer to Exercise 372 (9233 N2008/II/3). (i) (1 − i)2 = 1 − 1 − 2i = −2i. ✓

The other root is −1 + i, because (−1 + i)2 = 1 − 1 − 2i = −2i.

Remark 10. Do not make the mistake of concluding that by the complex conjugate roots theorem, 1 + i is the other root of the equation w² = −2i. The theorem applies only for polynomials whose coefficients are all real. It does not apply here because there is an imaginary coefficient.

(ii) z = [3 + 5i ± √((3 + 5i)² − 4(1)(−4)(1 − 2i))]/2 = [3 + 5i ± √(9 − 25 + 30i + 16(1 − 2i))]/2 = [3 + 5i ± √(−2i)]/2 = [3 + 5i ± (1 − i)]/2 = 2 + 2i, 1 + 3i.

Answer to Exercise 373 (9740 N2007/I/3). (a) This is the circle with radius √13 centred on the point −2 + 3i.

[Figure: the circle |z + 2 − 3i| = √13 with centre −2 + 3i and radius √13.]

(b) (a + ib)(a − ib) + 2(a + ib) = 3 + 4i or a² + b² + 2a + 2bi = 3 + 4i. Two complex numbers are equal if and only if their real and imaginary parts are equal. So a² + b² + 2a = 3 (equation 1) and 2b = 4 (equation 2). From equation 2, b = 2. Plug this into equation 1 to find that a² + 2a + 1 = 0 or a = −1. So w = −1 + 2i.

Answer to Exercise 374 (9740 N2007/I/7). (i) By the complex conjugate roots theorem, another root is re^{−iθ}. And so a quadratic factor of P(z) is

(z − re^{iθ})(z − re^{−iθ}) = z² − rze^{iθ} − rze^{−iθ} + (re^{iθ})(re^{−iθ})
= z² − rz(e^{iθ} + e^{−iθ}) + r² = z² − 2rz cos θ + r².

(ii) z⁶ = −64 = 64e^{iπ} = 2⁶e^{iπ(1 + 2k)} for k ∈ Z. So z = 2e^{iπ(1 + 2k)/6} for k = 0, ±1, ±2, −3.

(iii) First use what we found in part (ii). Then use what we found in part (i):

z⁶ + 64 = (z − 2e^{iπ/6})(z − 2e^{−iπ/6})(z − 2e^{3iπ/6})(z − 2e^{−3iπ/6})(z − 2e^{5iπ/6})(z − 2e^{−5iπ/6})
= [z² − 2(2) cos(π/6) z + 2²][z² − 2(2) cos(3π/6) z + 2²][z² − 2(2) cos(5π/6) z + 2²]
= (z² − 2√3z + 4)(z² + 4)(z² + 2√3z + 4).
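The roots in part (ii) and the real quadratic factorisation in part (iii) can be spot-checked numerically. The sketch below uses only the standard library; the function name `real_factors` and the test point `z0` are arbitrary choices for illustration.

```python
import cmath
import math

# The six roots 2*exp(i*pi*(1 + 2k)/6) for k = 0, ±1, ±2, -3.
roots = [2 * cmath.exp(1j * math.pi * (1 + 2 * k) / 6) for k in (0, 1, -1, 2, -2, -3)]
max_residual = max(abs(r**6 + 64) for r in roots)

def real_factors(z):
    """Product of the three real quadratic factors of z^6 + 64."""
    s = 2 * math.sqrt(3)
    return (z * z - s * z + 4) * (z * z + 4) * (z * z + s * z + 4)

# Compare the product with z^6 + 64 at an arbitrary test point.
z0 = 0.7 + 0.3j
gap = abs(real_factors(z0) - (z0**6 + 64))
print(max_residual, gap)
```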


Answer to Exercise 375 (9233 N2007/I/9). (i) By the complex conjugate roots theorem, another root is −ki. Altogether then,

az⁴ + bz³ + cz² + dz + e = a(z − ki)(z + ki)(z² + fz + g)
= a(z² + k²)(z² + fz + g)
= a[z⁴ + fz³ + (k² + g)z² + k²fz + gk²].

By comparing coefficients, we have b = af (equation 1), c = a(k² + g), d = ak²f (equation 2), and e = agk². Now verify that indeed:

ad² + b²e = a³k⁴f² + a³f²gk²
= (af) × [a(k² + g)] × (ak²f) = b × c × d. ✓

(ii) a = 1, b = 3, c = 13, d = 27, e = 36. So indeed ad² + b²e = 1 × 27² + 3² × 36 = 1053 = 3 × 13 × 27 = bcd. ✓

From equation 1 above, f = b/a = 3. So from equation 2, k = ±√(d/(af)) = ±√(27/(1 × 3)) = ±√9 = ±3. So the two desired roots are ±3i.

Answer to Exercise 376 (9233 N2007/II/5). The locus of P is the ray from (but excluding) the point 2i that makes an angle π/3 with the horizontal. This is one half of the line with cartesian equation y = x tan(π/3) + 2 = √3x + 2.

The locus of Q is the line that is equidistant from the points 4 and −2. It has cartesian equation x = 1.

[Figure: the ray P : arg(z − 2i) = π/3 and the line Q : |z + 2| = |z − 4|, with the points (−2, 0), (4, 0) and (1, 2) marked and the intersection at (1, √3 + 2).]

The intersection of the two lines is (1, √3 + 2) or 1 + i(√3 + 2).

[1 + i(√3 + 2)][1 − i(√3 + 2)] = 1 + (√3 + 2)² = 1 + 3 + 4 + 4√3 = 8 + 4√3. ✓

Answer to Exercise 377 (9233 N2006/I/5). (i) This is the circle with radius 3 centred on −4 + 4i.

[Figure: the circle |z + 4 − 4i| = 3 with centre A = (−4, 4), the point C = (0, 1), and the point B on the circumference closest to C.]

(ii) Given a point (C here), the line connecting it to the centre of a circle (A here) also passes through the point on the circumference (B here) that is closest to the given point (see Fact 56). The distance between A and C is √((−4 − 0)² + (4 − 1)²) = √(4² + 3²) = 5. So the distance between B and C is 5 − 3 = 2.

Answer to Exercise 378 (9233 N2006/I/6). (i)

0 = (ki)⁴ − 2(ki)³ + 6(ki)² − 8(ki) + 8 = k⁴ + 2k³i − 6k² − 8ki + 8 = (k⁴ − 6k² + 8) + 2k(k² − 4)i.

Both the real part k⁴ − 6k² + 8 and the imaginary part 2k(k² − 4) must equal 0. From 2k(k² − 4) = 0, we have k = 0, ±2. Only k = ±2 also satisfies k⁴ − 6k² + 8 = 0. So the equation has roots ±2i.

(ii) z⁴ − 2z³ + 6z² − 8z + 8 = (z − 2i)(z + 2i)(z² + az + b)
= (z² + 4)(z² + az + b) = z⁴ + az³ + (4 + b)z² + 4az + 4b.

Comparing coefficients, a = −2 and b = 2. So z² + az + b = z² − 2z + 2, whose zeros are

z = [2 ± √((−2)² − 4(1)(2))]/2 = 1 ± √(1 − 2) = 1 ± i.

Altogether then, the equation has roots ±2i, 1 ± i.

92.5 Answers for Ch. 78: Calculus

Answer to Exercise 379 (9740 N2015/I/3). (i) This question is simply asking you to explain the intuition behind the definite and Riemann integral (see Chapter 479).

See figure below. There are 10 rectangles. Each has width 1/10. The leftmost blue rectangle has height f(1/10). The second-leftmost rectangle has height f(2/10). Etc. The total area of the 10 rectangles is (1/10)[f(1/10) + f(2/10) + ⋅⋅⋅ + f(10/10)]. It approximates ∫_0^1 f(x) dx, which is the area under the graph of f, between 0 and 1. By increasing the number of rectangles, we improve the approximation. In the limit, it is plausible that we have

lim_{n→∞} {(1/n)[f(1/n) + f(2/n) + ⋅⋅⋅ + f(n/n)]} = ∫_0^1 f(x) dx.

[Figure: ten rectangles of width 1/10 approximating the area under the graph of f between 0 and 1.]

(ii) Let f : R → R be defined by x ↦ ∛x. Then by part (i), the given expression is equal to ∫_0^1 f(x) dx = ∫_0^1 ∛x dx = (3/4)[x^{4/3}]_0^1 = 3/4.
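The limit in part (i) can be watched numerically for the cube-root function of part (ii). This is only an illustrative sketch; `right_riemann_sum` and `cube_root` are hypothetical helper names, not anything from the exam question.

```python
def right_riemann_sum(f, n):
    """(1/n) * [f(1/n) + f(2/n) + ... + f(n/n)], the sum described in part (i)."""
    return sum(f(k / n) for k in range(1, n + 1)) / n

def cube_root(x):
    # Valid for x >= 0, which is all that is needed on [0, 1].
    return x ** (1 / 3)

# The sums should approach 3/4 as the number of rectangles grows.
for n in (10, 100, 1000, 10000):
    print(n, right_riemann_sum(cube_root, n))

approx = right_riemann_sum(cube_root, 100000)
```

Because the cube root is increasing, each right-endpoint sum slightly overestimates the integral, and the gap shrinks as n grows.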


Answer to Exercise 380 (9740 N2015/I/4). The total perimeter of the two shapes is d = 2(x + y) + 2x + πx. Rearranging, y = 0.5d − x(2 + 0.5π).

The total area of the two shapes is A = xy + 0.5πx². The total area is maximised where dA/dx = 0 or

dA/dx = y + x(dy/dx) + πx = 0.5d − x(2 + 0.5π) − x(2 + 0.5π) + πx = 0.5d − 4x = 0.

So x = d/8 and

A = (0.5d − [(2 + 0.5π)/8]d)(d/8) + 0.5π(d/8)²
= (1/16 − 1/32 − π/128 + π/128)d² = d²/32.
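The maximiser x = d/8 and the maximum area d²/32 can be confirmed with a crude grid search. The sketch below fixes d = 1 as an arbitrary illustrative value and uses a hypothetical helper `total_area`.

```python
import math

def total_area(x, d):
    """Area of the rectangle plus semicircle when the total wire length is d."""
    y = 0.5 * d - x * (2 + 0.5 * math.pi)  # from the perimeter constraint
    return x * y + 0.5 * math.pi * x**2

d = 1.0                   # illustrative wire length (an assumption)
x_max = d / (4 + math.pi)  # largest x for which y >= 0
steps = 100000

# Evaluate the area on a fine grid of x values and keep the best one.
best_x = max((i * x_max / steps for i in range(steps + 1)),
             key=lambda x: total_area(x, d))
best_area = total_area(best_x, d)
print(best_x, best_area)
```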

Answer to Exercise 381 (9740 N2015/I/6). (i) ln(1 + 2x) = (2x) − (2x)²/2 + (2x)³/3 − ⋅⋅⋅ = 2x − 2x² + (8/3)x³ − . . . .

(ii) ax(1 + bx)^c = ax[1 + cbx + c(c − 1)(bx)²/2! + c(c − 1)(c − 2)(bx)³/3! + . . . ]
= ax + abcx² + 0.5ab²c(c − 1)x³ + [ab³c(c − 1)(c − 2)/6]x⁴ + . . .

Comparing coefficients, a = 2 (equation 1), abc = −2 (equation 2), and 0.5ab²c(c − 1) = 8/3 (equation 3).

Solve this system of equations using your calculator or manually, as I do now: From equations 1 and 2, we have b = −1/c. Now from equation 3, we have (c − 1)/c = 8/3 or 1 − 1/c = 8/3 or c = −3/5. And b = 5/3.

Altogether then, the coefficient of x⁴ is

ab³c(c − 1)(c − 2)/6 = 2(5/3)³(−3/5)(−8/5)(−13/5)/6 = −104/27.
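The values a = 2, b = 5/3, c = −3/5 can be checked with exact rational arithmetic. This is a minimal sketch using Python's `fractions` module; the variable names are illustrative.

```python
from fractions import Fraction

a = Fraction(2)
b = Fraction(5, 3)
c = Fraction(-3, 5)

# Coefficients of x^2, x^3 and x^4 in the expansion of a*x*(1 + b*x)^c.
coef_x2 = a * b * c
coef_x3 = a * b**2 * c * (c - 1) / 2
coef_x4 = a * b**3 * c * (c - 1) * (c - 2) / 6

print(coef_x2, coef_x3, coef_x4)
```

The first two coefficients should match those of ln(1 + 2x), namely −2 and 8/3.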


Answer to Exercise 382 (9740 N2015/I/10). (i) The area under y = cos x between the origin and π/2 is A1 + A2 = ∫_0^{π/2} cos x dx = [sin x]_0^{π/2} = 1.

The area between y = cos x and y = sin x between the origin and P is A2 = ∫_0^{π/4} cos x dx − ∫_0^{π/4} sin x dx = [sin x]_0^{π/4} − [−cos x]_0^{π/4} = √2 − 1.

So A1 = 2 − √2. And

A1/A2 = (2 − √2)/(√2 − 1) = [(2 − √2)/(√2 − 1)] × [(√2 + 1)/(√2 + 1)] = (2√2 + 2 − 2 − √2)/(2 − 1) = √2.

(ii) The volume of the solid is π∫_0^{0.5√2} x² dy = π∫_0^{0.5√2} (sin⁻¹ y)² dy.

(iii) Substituting y = sin u and then integrating by parts twice,

π∫_0^{0.5√2} (sin⁻¹ y)² dy = π∫_0^{π/4} u² d(sin u) = π∫_0^{π/4} u² cos u du = π{[u² sin u]_0^{π/4} − ∫_0^{π/4} 2u sin u du}
= π³√2/32 − 2π{[u(−cos u)]_0^{π/4} − ∫_0^{π/4} (−cos u) du} = π³√2/32 + 2π[u cos u − sin u]_0^{π/4}
= π³√2/32 + 2π(π√2/8 − √2/2) = (π√2/2)(π²/16 + π/2 − 2).
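The closed form in part (iii) can be compared with a direct numerical evaluation of the integral in part (ii). The sketch below uses a hand-written composite Simpson's rule (the helper `simpson` is an assumption of this illustration, not a library function).

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

upper = 0.5 * math.sqrt(2)

# pi * integral of (arcsin y)^2 from 0 to sqrt(2)/2.
numeric_volume = math.pi * simpson(lambda y: math.asin(y) ** 2, 0.0, upper)

# The exact expression obtained by integrating by parts twice.
closed_form = (math.pi * math.sqrt(2) / 2) * (math.pi**2 / 16 + math.pi / 2 - 2)
print(numeric_volume, closed_form)
```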


Answer to Exercise 383 (9740 N2015/I/11). (i)

dy/dx = dy/dθ ÷ dx/dθ = (6 sin θ cos²θ − 3 sin³θ) ÷ (3 sin²θ cos θ) = 2 cot θ − tan θ.

(ii) dy/dx = 0 ⇐⇒ 2 cot θ − tan θ = 0 ⇐⇒ 2 = sin²θ/cos²θ = tan²θ ⇐⇒ tan θ = ±√2. So indeed there is a stationary point when tan θ = √2, which corresponds to where sin θ = √(2/3) and cos θ = √(1/3) and where (x, y) = ((2/3)√(2/3), 2/√3).

d²y/dx² = (d/dx)(2 cot θ − tan θ) = (dθ/dx)(−2 csc²θ − sec²θ) = (−2 csc²θ − sec²θ)/(3 sin²θ cos θ).

The numerator is always negative. At tan θ = √2, we have cos θ > 0 and so the denominator is positive. Altogether then, at tan θ = √2, the second derivative is negative, so that this is indeed a maximum turning point.

(iii) By observation, y ≥ 0 (for 0 ≤ θ ≤ 0.5π) and thus C is entirely above the x-axis. So the desired area is simply

∫_{θ=0}^{θ=0.5π} y dx = ∫_{θ=0}^{θ=0.5π} 3 sin²θ cos θ d(sin³θ) = ∫_0^{0.5π} 3 sin²θ cos θ (3 sin²θ cos θ) dθ
= ∫_0^{0.5π} 9 sin⁴θ cos²θ dθ ≈ 0.8835729338221293 ≈ 0.884.

(iv) The intersection points of the line and the curve C are given by 3 sin²θ cos θ = a sin³θ or 3/a = tan θ, as desired. Setting tan θ = √2 (the maximum point from part (ii)): √2 = 3/a ⇐⇒ a = 3/√2 = 1.5√2.

Answer to Exercise 384 (9740 N2015/II/1). (i) The maximum height is attained when dh/dt = 0 ⇐⇒ h = 32 m.

(ii) ∫ 1/√(16 − 0.5h) dh = ∫ (1/10) dt ⇐⇒ √(16 − 0.5h)/[0.5 × (−0.5)] + C = t/10 ⟹ t = A − 40√(16 − 0.5h).

Since h = 0 when t = 0, we have A = 40(16)^0.5 = 160. So t = 160 − 40(16 − 0.5h)^0.5. It takes t(h = 16) = 160 − 40(16 − 0.5 × 16)^0.5 = 160 − 40 × 8^0.5 = 160 − 80√2 years after planting for the tree to reach half its maximum height.

Answer to Exercise 385 (9740 N2014/I/2). Apply d/dx to the given equation:

2xy + x²(dy/dx) + y² + x(2y)(dy/dx) = 0.

Plugging in dy/dx = −1 implies that −x² + y² = 0 or y = ±x. What we’ve just shown is that dy/dx = −1 only if y = ±x.

Plug y = −x into the original equation: −x³ + x³ + 54 = 54 = 0. Clearly, this equation is never true. Next plug in y = x: 2x³ + 54 = 0 ⇐⇒ x = −3. So the one and only point at which the gradient is −1 is (−3, −3).

Answer to Exercise 386 (9740 N2014/I/7). (i) α ≈ 1.885 (calculator).

The intersection points of the curve with the line y = −7 are given by −7 = x⁶ − 3x⁴ − 7 or x⁴(x² − 3) = 0. So x = 0, ±√3. By observation, β = √3.

(ii) ∫_α^β f(x) dx ≈ −0.597 (calculator).

(iii) The desired area is |∫_0^{√3} f(x) − (−7) dx| = −[x⁷/7 − 3x⁵/5 − 7x + 7x]_0^{√3} = −(3^{3.5}/7 − 3^{3.5}/5) = 54√3/35.

(iv) f(−x) = (−x)⁶ − 3(−x)⁴ − 7 = x⁶ − 3x⁴ − 7 = f(x).

This last question (“What can be said about the six roots of the equation f(x) = 0?”) is a strangely open-ended question. I don’t know what they wanted, so here I’ll just give a complete answer, though probably you didn’t need to do so much work to get the full mark. Two of the roots are ±α ≈ ±1.885.

So x⁶ − 3x⁴ − 7 = (x − α)(x + α)(x⁴ + ax² + b) = x⁶ + (a − α²)x⁴ + (b − α²a)x² − α²b. Comparing coefficients, a − α² = −3 (equation 1), b − α²a = 0 (equation 2), −α²b = −7 (equation 3). From equation 1, a = α² − 3 and from equation 3, b = 7/α². (You can verify for yourself that these values of a and b also satisfy equation 2.) And now solving the quadratic equation x⁴ + ax² + b = 0, we have x² = (−a ± √(a² − 4b))/2.

Observe that a² − 4b < 0, so that both values of x² are non-real. The square roots of the above values of x² yield the other four roots, all of which are also non-real:

x = ±√[(−a ± √(a² − 4b))/2] = ±√[(3 − α² ± √((α² − 3)² − 28/α²))/2].

Answer to Exercise 387 (9740 N2014/I/8). (i) ∫ f(x) dx = sin⁻¹(x/3) + C, where C is the constant of integration.

(ii) f(x) = (9 − x²)^{−0.5} = 9^{−0.5}[1 − (x/3)²]^{−0.5} = (1/3)[1 − (x/3)²]^{−0.5}
= (1/3){1 + (−1/2)[−(x/3)²] + (−1/2)(−3/2)[−(x/3)²]²/2! + (−1/2)(−3/2)(−5/2)[−(x/3)²]³/3! + . . . }
= 1/3 + x²/(3³ × 2) + x⁴/(3⁴ × 2³) + 5x⁶/(3⁷ × 2⁴) + . . .

(iii) ∫ f(x) dx = ∫ [1/3 + x²/(3³ × 2) + x⁴/(3⁴ × 2³) + 5x⁶/(3⁷ × 2⁴) + . . . ] dx

⟹ sin⁻¹(x/3) = C + x/3 + x³/(3⁴ × 2) + x⁵/(5 × 3⁴ × 2³) + 5x⁷/(7 × 3⁷ × 2⁴) + . . .

Setting x = 0 gives C = 0, so

sin⁻¹(x/3) = x/3 + x³/(3⁴ × 2) + x⁵/(5 × 3⁴ × 2³) + 5x⁷/(7 × 3⁷ × 2⁴) + . . .

Answer to Exercise 388 (9740 N2014/I/10). (i) −0.25 = k(1 + 0.5 − 0.5²) = 1.25k ⟹ k = −0.2.

(ii) 1 + x − x² = −(x − 0.5)² + 1.25. So

∫ 1/[1.25 − (x − 0.5)²] dx = ∫ −0.2 dt

⟹ [1/(2√1.25)] ln |[√1.25 + (x − 0.5)]/[√1.25 − (x − 0.5)]| + C = −0.2t.

Note that because 0 ≤ x ≤ 0.5, we can remove the absolute value operator.

⟹ √5 ln {[√5 − (2x − 1)]/[√5 + (2x − 1)]} + B = t.

When t = 0, x = 0.5, so B = 0. Altogether then,

t = √5 ln {[√5 − (2x − 1)]/[√5 + (2x − 1)]}.

(iii)(a) t(x = 0.25) = √5 ln [(√5 + 0.5)/(√5 − 0.5)].

(iii)(b) t(x = 0) = √5 ln [(√5 + 1)/(√5 − 1)].

(iv) e^{t/√5} = [√5 − (2x − 1)]/[√5 + (2x − 1)] = −1 + 2√5/[√5 + (2x − 1)]
⟹ 2√5/(e^{t/√5} + 1) = √5 + 2x − 1
⟹ x = √5/(e^{t/√5} + 1) + 0.5(1 − √5).

[Figure: graph of x = f(t) against t.]

Answer to Exercise 389 (9740 N2014/I/11). (i) h = √(4² − r²) = √(16 − r²). So V = 2πr³/3 + πr²h/3 = 2πr³/3 + πr²√(16 − r²)/3.

dV/dr = 2πr² + (π/3)[2r√(16 − r²) + r² × 0.5(−2r)/√(16 − r²)] = 2πr² + (πr/3)[2√(16 − r²) − r²/√(16 − r²)]
= 2πr² + [πr/(3√(16 − r²))][2(16 − r²) − r²] = πr[2r + (32 − 3r²)/(3√(16 − r²))].

dV/dr = 0 ⇐⇒ r = 0 (clearly not the maximum) or 2r + (32 − 3r²)/(3√(16 − r²)) = 0
⟹ 2r(3√(16 − r²)) = 3r² − 32
⟹ 36r²(16 − r²) = 9r⁴ + 32² − 6 × 32r² (step 1: squaring both sides)
⟹ 0 = 45r⁴ − 768r² + 1024.

(ii) r ≈ 1.207, 3.951 (calculator).

(iii) dV/dr evaluated at r ≈ 1.207 is not 0, while dV/dr evaluated at r ≈ 3.951 is 0 (see the remark below). Presuming that V is maximised at a positive root to the equation 0 = 45r⁴ − 768r² + 1024, we have r1 ≈ 3.951 and h ≈ 0.625460188.

(iv) [Figure: graph of V against r for 0 ≤ r ≤ 4.]

Remark: We found r ≈ 1.207 by setting dV/dr = 0. So how can it be that dV/dr ≠ 0 at r ≈ 1.207? The reason is that at step 1, we applied a squaring operation. This resulted in additional solutions for our eventual equation 0 = 45r⁴ − 768r² + 1024 that were not shared by the equation dV/dr = 0.

Here’s a simple example to illustrate: Say we have the equation x = 2. We square both sides to get x² = 4. We now conclude that x = ±2. But in fact the additional solution x = −2 should be rejected.
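The extraneous root introduced by squaring can be seen directly by evaluating dV/dr at both positive roots of 45r⁴ − 768r² + 1024 = 0. The following is a rough sketch; the function names `dV_dr` and `quartic_roots` are hypothetical.

```python
import math

def dV_dr(r):
    """dV/dr = pi*r*[2r + (32 - 3r^2) / (3*sqrt(16 - r^2))] for 0 < r < 4."""
    root = math.sqrt(16 - r * r)
    return math.pi * r * (2 * r + (32 - 3 * r * r) / (3 * root))

def quartic_roots():
    """Positive roots of 45r^4 - 768r^2 + 1024 = 0, treated as a quadratic in r^2."""
    disc = math.sqrt(768**2 - 4 * 45 * 1024)
    return [math.sqrt((768 - disc) / 90), math.sqrt((768 + disc) / 90)]

r_small, r_large = quartic_roots()

# Only the larger root should make dV/dr vanish; the smaller one is extraneous.
print(r_small, dV_dr(r_small))
print(r_large, dV_dr(r_large))
```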


Answer to Exercise 390 (9740 N2014/II/2).

(9x² + x − 13)/[(2x − 5)(x² + 9)] = A/(2x − 5) + (Bx + C)/(x² + 9) = [(A + 2B)x² + (2C − 5B)x + 9A − 5C]/[(2x − 5)(x² + 9)].

Comparing coefficients, A + 2B = 9 (equation 1), 2C − 5B = 1 (equation 2), and 9A − 5C = −13 (equation 3). Take 2.5 × equation 1 plus equation 2 plus 0.4 × equation 3 to get 2.5A + 3.6A = 2.5 × 9 + 1 + 0.4 × (−13), i.e. 6.1A = 18.3 ⟹ A = 3. So we also have C = 8 and B = 3. Hence,

∫_0^2 (9x² + x − 13)/[(2x − 5)(x² + 9)] dx = ∫_0^2 [3/(2x − 5) + (3x + 8)/(x² + 9)] dx
= [(3/2) ln |2x − 5| + (3/2) ln(x² + 9) + (8/3) tan⁻¹(x/3)]_0^2 (note that 2x − 5 < 0 on [0, 2])
= (3/2) ln [(5 − 2 × 2)(2² + 9)/(5 × 9)] + (8/3)[tan⁻¹(2/3) − tan⁻¹(0/3)] = (3/2) ln(13/45) + (8/3) tan⁻¹(2/3).
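The exact value can be compared against a numerical integral of the original rational function. This is a sketch only, again with a hand-rolled Simpson's rule (the helper names are assumptions made for the illustration).

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(x):
    # The singularity at x = 2.5 lies outside the interval [0, 2].
    return (9 * x**2 + x - 13) / ((2 * x - 5) * (x**2 + 9))

numeric = simpson(integrand, 0.0, 2.0)
exact = 1.5 * math.log(13 / 45) + (8 / 3) * math.atan(2 / 3)
print(numeric, exact)
```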


Answer to Exercise 391 (9740 N2013/I/5). (i) Observe that f(x) = √(1 − (x/a)²) implies [f(x)]² + (x/a)² = 1, so this portion of the graph describes the upper half of an ellipse with horizontal intercepts ±a and vertical intercept 1. The graph in the regions [−4a, −a) and [2a, 5a) looks the same as that in the region [−a, 2a). Altogether then:

[Figure: graph of y = f(x).]

(ii) Note that (√3/2)a < a. And so ∫_{a/2}^{√3a/2} f(x) dx = ∫_{a/2}^{√3a/2} √(1 − x²/a²) dx.

Now use the given substitution x = a sin θ:

∫_{a/2}^{√3a/2} √(1 − x²/a²) dx = ∫_{π/6}^{π/3} √(1 − a² sin²θ/a²) a cos θ dθ = ∫_{π/6}^{π/3} √(1 − sin²θ) a cos θ dθ
= ∫_{π/6}^{π/3} a cos²θ dθ = a∫_{π/6}^{π/3} (cos 2θ + 1)/2 dθ = (a/2)[(sin 2θ)/2 + θ]_{π/6}^{π/3}
= (a/2)[(0.5√3 − 0.5√3)/2 + π/3 − π/6] = πa/12.

Answer to Exercise 392 (9740 N2013/I/10). (i) ∫ 1/(3 − 2z) dz = ∫ dx

⟹ −0.5 ln(3 − 2z) = x + C ⟹ z = 1.5 − 0.5e^{−2(x + C)} = 1.5 − 0.5Ae^{−2x}, where C is the constant of integration and A = e^{−2C}.

(ii) dy/dx = 1.5 − 0.5Ae^{−2x}. So y = 1.5x + 0.25Ae^{−2x} + B, where B is the constant of integration.

(iii) d²y/dx² = Ae^{−2x}. So Ae^{−2x} = a(1.5 − 0.5Ae^{−2x}) + b. And thus, a = −2 and b = 3.

(iv) Two of these lines have equations y = 1.5x (where A = 0 = B) and y = 1.5x + 1 (where A = 0, B = 1). A non-linear member of the family of curves is y = 1.5x + e^{−2x} (where A = 4, B = 0). For this member, as x → ∞, y → 1.5x, so it has y = 1.5x as an asymptote.

[Figure: the curve y = 1.5x + e^{−2x} and its asymptote y = 1.5x.]

Answer to Exercise 393 (9740 N2013/I/11). (i)

dy/dx = dy/dt ÷ dx/dt = 6t² ÷ 6t = t.

So the tangent has equation y − 2t³ = t(x − 3t²) or y = tx − t³.

(ii) The two tangents have equations y = px − p³ and y = qx − q³. At their intersection point(s), px − p³ = qx − q³ or x = (p³ − q³)/(p − q) = p² + pq + q² (the division by p − q is permitted if we assume that the points P and Q are distinct). y = px − p³ = p(x − p²) = p(pq + q²). Altogether, R = (p² + pq + q², p(pq + q²)).

If pq = −1, then R = (p² − 1 + q², −p − q). And indeed x_R = p² − 1 + q² = p² + q² + 2pq + 1 = (−p − q)² + 1 = y_R² + 1. That is, R satisfies the equation x = y² + 1 and thus lies on the curve described by that equation.

(iii) The point M is on the curve C and thus satisfies x = 3t² (equation 1) and y = 2t³ (equation 2). It is also on the curve L and thus satisfies x = y² + 1 (equation 3). Plug equations 1 and 2 into equation 3 to get 3t² = (2t³)² + 1 = 4t⁶ + 1 or 4t⁶ − 3t² + 1 = 0, as desired.

By observation, t² = −1 solves this last equation. So t = ±i are a pair of possible solutions. But these two imaginary solutions cannot possibly correspond to M.

Let’s look for the other solutions. We have that 4t⁶ − 3t² + 1 = (t² + 1)(4t⁴ + at² + b) = 4t⁶ + (a + 4)t⁴ + (a + b)t² + b. So a = −4 and b = 1. Altogether then, 4t⁶ − 3t² + 1 = (t² + 1)(4t⁴ − 4t² + 1) = (t² + 1)(2t² − 1)². So the other solutions are t = ±√0.5. At the point M, y ≥ 0, so it must be that t = √0.5 and M = (3(√0.5)², 2(√0.5)³) = (1.5, √0.5).

(iv) At least for the region illustrated, the curve C can be written as y = (2x/3)√(x/3) = (2/3^{1.5})x^{1.5}. And so the area below the curve C and above the x-axis between x = 0 and x = 1.5 is

A = ∫_0^{1.5} y dx = (2/3^{1.5}) ∫_0^{1.5} x^{1.5} dx = (2/3^{1.5})(1/2.5)[x^{2.5}]_0^{1.5} = (2/3^{1.5})(1/2.5)(1.5^{2.5}) = 0.3√2.

At least for the region illustrated, the curve L can be written as y = √(x − 1). L intersects the x-axis at x = 1. And so the area below the curve L and above the x-axis between x = 1 and x = 1.5 is

B = ∫_1^{1.5} y dx = ∫_1^{1.5} √(x − 1) dx = (1/1.5)[(x − 1)^{1.5}]_1^{1.5} = (2/3)(0.5^{1.5}) = (1/3)√0.5 = (1/6)√2.

Altogether then, the desired area is A − B = (2/15)√2.

Answer to Exercise 394 (9740 N2013/II/2). (i) AD has length x tan(π/3) = √3x. So too does BE. Hence, DE has length a − 2√3x. This is also the length of each side of the equilateral triangle shown in Fig. 2. So that triangle has area

0.5(a − 2√3x)² sin(π/3) = 0.25√3(a − 2√3x)².

The prism has volume equal to the area we just calculated multiplied by its height x: V = 0.25√3(a − 2√3x)²x, as desired.

[Figures 1 to 3: the triangular card of side a with the points A, D, E, B and C and the cut-off lengths x marked.]

(ii) dV/dx = (√3/4)[2(a − 2√3x)(−2√3)x + (a − 2√3x)²]
= (√3/4)(a − 2√3x)(−4√3x + a − 2√3x) = (√3/4)(a − 2√3x)(a − 6√3x).

So dV/dx = 0 ⇐⇒ x = a/(2√3), a/(6√3).

These are the two stationary points of V as a function of x. To determine their nature, we use the second derivative test:

d²V/dx² = (√3/4)[−2√3(a − 6√3x) − 6√3(a − 2√3x)] = (3/2)(12√3x − 4a).

The second derivative is positive when evaluated at x = a/(2√3) and negative when evaluated at x = a/(6√3). Thus, the maximum value of V occurs when x = a/(6√3):

V = (√3/4)(a − 2√3x)²x = (√3/4)(a − a/3)² × a/(6√3) = (1/4)(4/9)a² × (a/6) = a³/54.

Answer to Exercise 395 (9740 N2013/II/3). (i) f(0) = ln(1 + 2 sin 0) = ln 1 = 0.

f′(x) = 2 cos x/(1 + 2 sin x) ⟹ f′(0) = 2 cos 0/(1 + 2 sin 0) = 2,

f′′(x) = [(1 + 2 sin x)(−2 sin x) − 2 cos x(2 cos x)]/(1 + 2 sin x)² = (−2 sin x − 4 sin²x − 4 cos²x)/(1 + 2 sin x)²
= (−2 sin x − 4)/(1 + 2 sin x)² = −2(sin x + 2)/(1 + 2 sin x)² ⟹ f′′(0) = −2(sin 0 + 2)/(1 + 2 sin 0)² = −4,

f′′′(x) = −2[(1 + 2 sin x)² cos x − (sin x + 2) × 2(1 + 2 sin x)(2 cos x)]/(1 + 2 sin x)⁴
⟹ f′′′(0) = −2[(1 + 2 sin 0)² cos 0 − (sin 0 + 2) × 2(1 + 2 sin 0)(2 cos 0)]/(1 + 2 sin 0)⁴ = −2[1 − (2)(2)(1)(2)]/1 = 14.

So f(x) = 0 + 2x − (4/2!)x² + (14/3!)x³ + ⋅⋅⋅ = 2x − 2x² + (7/3)x³ + . . .

(ii) e^{ax} sin nx = [1 + ax + (ax)²/2! + (ax)³/3! + . . . ][0 + nx + 0 − (nx)³/3! + . . . ]
= nx + anx² + (a²n/2 − n³/6)x³ + . . .

We are told that 2 = n and −2 = an, so a = −1. Hence, the third non-zero term in the Maclaurin series for e^{ax} sin nx is (1 − 8/6)x³ = −x³/3.

Answer to Exercise 396 (9740 N2012/I/2). (i) ∫ x³/(1 + x⁴) dx = ln(1 + x⁴)/4 + C.

(ii) With u = x², observe that du = 2x dx. So

∫ x/(1 + x⁴) dx = ∫ 0.5/(1 + u²) du = 0.5 tan⁻¹ u + C = 0.5 tan⁻¹(x²) + C.

(iii) 0.186 (calculator). (See footnote 98.)

Answer to Exercise 397 (9740 N2012/I/4). (i) By the Law of Sines,

AC/sin(3π/4) = AB/sin[π − (3π/4 + θ)] = 1/sin(π/4 − θ)

⇐⇒ AC = sin(3π/4)/sin(π/4 − θ) = sin(3π/4)/[sin(π/4) cos θ − sin θ cos(π/4)] = 1/(cos θ − sin θ),

where the last equality uses sin(3π/4) = sin(π/4) = cos(π/4).

(ii) cos θ = 1 − 0.5θ² + . . . and sin θ = θ + . . . . So cos θ − sin θ = 1 − θ − 0.5θ² + . . . . Thus,

1/(cos θ − sin θ) = 1/(1 − θ − 0.5θ² + . . . ) = [1 − (θ + 0.5θ² + . . . )]⁻¹
= 1 + (θ + 0.5θ² + . . . ) + (θ + 0.5θ² + . . . )² + . . . = 1 + (θ + 0.5θ²) + θ² + ⋅⋅⋅ = 1 + θ + 1.5θ² + . . .

98

It’s actually possible to show, albeit with a lot of work, that this definite integral is equal to

(1/32)[√2 ln[(x² + 1 − √2x)/(x² + 1 + √2x)] + 2√2 tan⁻¹(1 + √2x) − 2√2 tan⁻¹(1 − √2x) + 8x³/(x⁴ + 1)]_0^1
= ... more work ... = (1/32)[√2 ln(3 − 2√2) + √2π + 4] ≈ 0.186.

Answer to Exercise 398 (9740 N2012/I/8). (i) Apply d/dx to the equation:

1 − dy/dx = 2(x + y)(1 + dy/dx) = 2(x + y) + 2(x + y)(dy/dx)
⇐⇒ 1 − 2x − 2y = (2x + 2y + 1)(dy/dx)
⇐⇒ (1 − 2x − 2y)/(2x + 2y + 1) = dy/dx (provided 2x + 2y + 1 ≠ 0)
⇐⇒ −1 + 2/(2x + 2y + 1) = dy/dx ⇐⇒ 2/(2x + 2y + 1) = 1 + dy/dx,

as desired. (Note that where 2x + 2y + 1 = 0, dy/dx is undefined.)

(ii) Apply d/dx to the equation found in (i):

d²y/dx² = [−2/(2x + 2y + 1)²](2 + 2 dy/dx) = −[4/(2x + 2y + 1)²](1 + dy/dx)
= −[2/(2x + 2y + 1)]²(1 + dy/dx) = −(1 + dy/dx)³, as desired.

(iii) Any turning point occurs only if dy/dx = 0. So at any such point, the second derivative is equal to −(1 + 0)³ = −1 < 0. So by the second derivative test, the turning point is a maximum.

Answer to Exercise 399 (9740 N2012/I/10). (i) The Lower Part of the concert hall has volume πr²h and the Upper Part has volume (2/3)πr³. So the total volume is V = πr²h + (2/3)πr³ = k. So h = [k − (2/3)πr³]/(πr²) = k/(πr²) − 2r/3 and dh/dr = −2k/(πr³) − 2/3.

Lower Part has external surface area πr² + 2πrh and Upper Part has external surface area 2πr². So the total external surface area is A = πr(r + 2h + 2r) = 3πr² + 2πrh. Apply d/dr:

dA/dr = 6πr + 2π(h + r dh/dr) = 6πr + 2π[k/(πr²) − (2/3)r + r(−2k/(πr³) − 2/3)]
= 6πr + 2π[k/(πr²) − (4/3)r − 2k/(πr²)] = 6πr − (8/3)πr − 2k/r² = (10/3)πr − 2k/r².

dA/dr = 0 ⇐⇒ (10/3)πr − 2k/r² = 0 ⇐⇒ (10/3)πr³ = 2k ⇐⇒ r = (3k/(5π))^{1/3}. This is the minimum point. And

h = k/(πr²) − 2r/3 = (k/π)(5π/(3k))^{2/3} − (2/3)(3k/(5π))^{1/3} = (k/π)^{1/3}[(5/3)^{2/3} − (2/3)(3/5)^{1/3}] = (3k/(5π))^{1/3}.

(ii) We are now instead given that V = πr²h + 2πr³/3 = 200 (equation 1) and A = 3πr² + 2πrh = 180 (equation 2). So r ≈ −6.75874, 3.03721, 3.72153 (calculator).

We can reject the negative solution. So the two possible values of r are 3.03721 and 3.72153.

The corresponding values of h are h = 90/(πr) − 1.5r ≈ 4.88, 2.12.

Since r < h, we have r ≈ 3.04, h ≈ 4.88.
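The calculator step in part (ii) can be reproduced by eliminating h. Substituting h = 90/(πr) − 1.5r from the area equation into the volume equation gives 90r − (5π/6)r³ = 200, which the sketch below solves by bisection. The bracketing intervals [3, 3.4] and [3.4, 4] are assumptions chosen by inspecting the sign of the function; `g`, `h_from_r` and `bisect` are hypothetical helper names.

```python
import math

def g(r):
    """Volume equation with h eliminated: 90r - (5*pi/6) r^3 - 200."""
    return 90 * r - (5 * math.pi / 6) * r**3 - 200

def h_from_r(r):
    """h obtained from the surface-area equation 3*pi*r^2 + 2*pi*r*h = 180."""
    return 90 / (math.pi * r) - 1.5 * r

def bisect(f, lo, hi, tol=1e-10):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

r1 = bisect(g, 3.0, 3.4)
r2 = bisect(g, 3.4, 4.0)
h1, h2 = h_from_r(r1), h_from_r(r2)
print(r1, h1)
print(r2, h2)
```

Only the first pair satisfies r < h, in line with the answer above.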


Answer to Exercise 400 (9740 N2012/I/11).

(i) Note that x is increasing in θ.

$$\frac{dy}{dx} = \frac{dy}{d\theta} \div \frac{dx}{d\theta} = \frac{\sin\theta}{1 - \cos\theta} = \frac{2\sin(0.5\theta)\cos(0.5\theta)}{2\sin^2(0.5\theta)} = \frac{\cos(0.5\theta)}{\sin(0.5\theta)} = \cot(0.5\theta).$$

Note that $\frac{dy}{dx}$ is undefined where $\theta = 0$ or $\theta = 2\pi$.

$$\lim_{\theta \to 0}\frac{dy}{dx} = \infty, \qquad \left.\frac{dy}{dx}\right|_{\theta = \pi} = 0, \qquad \lim_{\theta \to 2\pi}\frac{dy}{dx} = -\infty.$$

So at $\theta = \pi$, the gradient of C is $\cot(\pi/2) = 0$. As $\theta \to 0$ or $\theta \to 2\pi$, the tangents to C become vertical.

(ii) You can certainly use your graphing calculator as an aid, but in this case we can easily figure out the graph even without a calculator, and so that's what I'll do as an exercise. First figure out the endpoints: $\theta = 0 \implies (x, y) = (0, 0)$ and $\theta = 2\pi \implies (x, y) = (2\pi, 0)$.

In between, we know that $dy/dx = \cot(\theta/2)$, with $dy/dx \to \infty$ as $\theta \to 0$ and $dy/dx \to -\infty$ as $\theta \to 2\pi$. Moreover, $dy/dx = 0$ when $\theta = \pi$. So the curve C starts with vertical slope at $\theta = 0$, keeps increasing but at a decreasing rate until $\theta = \pi$, then keeps decreasing at an increasing rate.

[Figure: sketch of C from $(0, 0)$ to $(2\pi, 0)$, with maximum point at $x = \pi$, $y = 2$, and vertical tangents at both endpoints.]

(iii)
$$\int_{x=0}^{x=2\pi} y\,dx = \int_{\theta=0}^{\theta=2\pi} (1 - \cos\theta)(1 - \cos\theta)\,d\theta = \int_0^{2\pi} 1 - 2\cos\theta + \cos^2\theta\,d\theta$$
$$= \Big[\theta - 2\sin\theta\Big]_0^{2\pi} + \int_0^{2\pi} \frac{\cos 2\theta + 1}{2}\,d\theta = \left[1.5\theta - 2\sin\theta + \frac{\sin 2\theta}{4}\right]_0^{2\pi} = 3\pi.$$

(iv) The normal to C at P has equation $y - (1 - \cos p) = -\left[x - (p - \sin p)\right] / \cot(p/2)$. Where it crosses the x-axis, we have

$$0 - (1 - \cos p) = \frac{-\left[x - (p - \sin p)\right]}{\cot(p/2)} \iff x = \underbrace{\cot\frac{p}{2}\,(1 - \cos p)}_{= \sin p \text{ (see above)}} + (p - \sin p) = p.$$
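As a sanity check on Exercise 400(iii) and (iv), the sketch below integrates $y\,\frac{dx}{d\theta} = (1 - \cos\theta)^2$ numerically and computes where the normal at a sample point meets the x-axis. It assumes the parametrisation $x = \theta - \sin\theta$, $y = 1 - \cos\theta$ used in the working above; the helper names are mine.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def area_under_arch():
    """Integral of y dx over one arch, written as an integral in theta."""
    return simpson(lambda t: (1 - math.cos(t)) ** 2, 0.0, 2 * math.pi)

def normal_x_intercept(p):
    """x-coordinate where the normal at parameter p crosses the x-axis."""
    x_p, y_p = p - math.sin(p), 1 - math.cos(p)
    tangent = math.sin(p) / (1 - math.cos(p))   # dy/dx at P
    # Normal: y - y_p = -(x - x_p) / tangent; set y = 0 and solve for x.
    return x_p + y_p * tangent

print(area_under_arch(), 3 * math.pi)
print(normal_x_intercept(1.0))
```

The first line printed should show two values that agree closely, and the intercept for $p = 1$ should come out as 1, in line with $x = p$.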

Answer to Exercise 401 (9740 N2012/II/1).

(a) $\frac{dy}{dx} = 16x - 3x^3 + A$ and $y = 8x^2 - 0.75x^4 + Ax + B$, where $A$ and $B$ are constants of integration.

(b)
$$\int \frac{1}{16 - 9u^2}\,du = \int dt \iff \frac{1}{2 \times 4 \times 3}\ln\left|\frac{4 + 3u}{4 - 3u}\right| + C = t, \quad\text{where } C = -\frac{1}{24}\ln\left|\frac{7}{1}\right|.$$

Altogether,

$$t = \frac{1}{24}\left(\ln\left|\frac{4 + 3u}{4 - 3u}\right| - \ln 7\right).$$

Answer to Exercise 402 (9740 N2011/I/3). (i) $\frac{dy}{dx} = \frac{dy}{dt} \div \frac{dx}{dt} = -\frac{2}{t^2} \div (2t) = -\frac{1}{t^3}$. So the equation of the tangent at the given point is

$$y - \frac{2}{p} = -\frac{1}{p^3}\left(x - p^2\right) \quad\text{or}\quad y = -\frac{x}{p^3} + \frac{3}{p}.$$

(ii) The horizontal intercept is given by $0 = -\frac{x}{p^3} + \frac{3}{p}$ or $x = 3p^2$. So $Q = (3p^2, 0)$.

The vertical intercept is given by $y = 0 + \frac{3}{p}$. So $R = \left(0, \frac{3}{p}\right)$.

(iii) The mid-point of QR is $\left(1.5p^2, \frac{1.5}{p}\right)$. The locus of the mid-point as $p$ varies is

$$\left\{(x, y) : x = 1.5p^2,\ y = \frac{1.5}{p},\ p \in \mathbb{R}^- \cup \mathbb{R}^+\right\}.$$

The cartesian equation of this locus is thus $x = \frac{1.5^3}{y^2}$.

Answer to Exercise 403 (9740 N2011/I/4). (i) $\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} + \dots = 1 - 0.5x^2 + \frac{x^4}{24} + \dots$. Thus:

$$\cos^6 x = \left(1 - 0.5x^2 + \frac{x^4}{24} + \dots\right)^6 = \left[1 + \left(-0.5x^2 + \frac{x^4}{24} + \dots\right)\right]^6$$
$$= 1 + 6\left(-0.5x^2 + \frac{x^4}{24} + \dots\right) + \frac{6 \times 5}{2}\left(-0.5x^2 + \frac{x^4}{24} + \dots\right)^2 + \dots$$
$$= 1 - 3x^2 + \frac{x^4}{4} + 15\cdot\frac{x^4}{4} + \dots = 1 - 3x^2 + 4x^4 + \dots$$

(ii)(a)
$$\int_0^a \cos^6 x\,dx = \int_0^a \left(1 - 3x^2 + 4x^4 + \dots\right)dx = \left[x - x^3 + 0.8x^5 + \dots\right]_0^a = a - a^3 + 0.8a^5 + \dots$$

So where $a = \pi/4$,

$$\int_0^{\pi/4} \cos^6 x\,dx \approx \frac{\pi}{4} - \left(\frac{\pi}{4}\right)^3 + 0.8\left(\frac{\pi}{4}\right)^5 \approx 0.5400.$$

(b) $\int_0^{\pi/4} \cos^6 x\,dx \approx 0.4746$ (calculator). Using the first few terms of the Maclaurin series as an approximation works well if $a$ is close to 0. In this case, $a = \frac{\pi}{4}$ is not close to 0 and so there is no good reason why this approximation would have worked well.
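To see the point of Exercise 403(ii)(b) in numbers, here is a small sketch comparing the truncated series $a - a^3 + 0.8a^5$ against a numerical integral of $\cos^6 x$. The Simpson's-rule helper is my own stand-in for the calculator; any numerical integrator would do.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def series_estimate(a):
    """Integral of the truncated series 1 - 3x^2 + 4x^4 from 0 to a."""
    return a - a**3 + 0.8 * a**5

def numeric_integral(a):
    """Numerical value of the integral of cos^6 x from 0 to a."""
    return simpson(lambda x: math.cos(x) ** 6, 0.0, a)

for a in (0.1, math.pi / 4):
    print(f"a = {a:.4f}: series {series_estimate(a):.6f}, numeric {numeric_integral(a):.6f}")
```

For $a = 0.1$ the two values agree to many decimal places, while for $a = \pi/4$ they differ already in the second decimal place.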

Answer to Exercise 404 (9740 N2011/I/5). (i)

$$f(|x|) = \begin{cases} 2 - x, & \text{for } x \geq 0, \\ 2 + x, & \text{for } x < 0, \end{cases} \qquad |f(x)| = \begin{cases} 2 - x, & \text{for } x \leq 2, \\ x - 2, & \text{for } x > 2. \end{cases}$$

The graph of y = f (∣x∣) is simply that of y = f (x) but with the region where x < 0 replaced by the reflection in the vertical axis of the graph y = f (x) in the region where x > 0.

The graph of y = ∣f (x)∣ is simply that of y = f (x) but with the region where y < 0 replaced by its reflection in the horizontal axis.

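[Figure: two graphs. Left: $y = f(|x|)$, with the points $(-2, 0)$, $(0, 2)$ and $(2, 0)$ labelled. Right: $y = |f(x)|$, with the points $(0, 2)$ and $(2, 0)$ labelled.]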

(ii) You can simply write down what you observe from the graphs: f (∣x∣) = ∣f (x)∣ ⇐⇒ x ∈ [0, 2].

Or if we want to solve this algebraically and rigorously:

For $x > 2$, $f(|x|) = |f(x)| \iff 2 - x = x - 2 \iff x = 2$ (NA).

For $x \in [0, 2]$, $f(|x|) = |f(x)| \iff 2 - x = 2 - x$, which is always true.

For $x < 0$, $f(|x|) = |f(x)| \iff 2 + x = 2 - x \iff x = 0$ (NA).

Altogether then, $f(|x|) = |f(x)| \iff x \in [0, 2]$.

(iii)
$$\int_{-1}^{1} f(|x|)\,dx = \int_{-1}^{0} (2 + x)\,dx + \int_0^1 (2 - x)\,dx = \left[2x + \frac{x^2}{2}\right]_{-1}^{0} + \left[2x - \frac{x^2}{2}\right]_0^1 = 3.$$

$$\int_1^a |f(x)|\,dx = \int_1^2 (2 - x)\,dx + \int_2^a (x - 2)\,dx = \left[2x - \frac{x^2}{2}\right]_1^2 + \left[\frac{x^2}{2} - 2x\right]_2^a = 0.5a^2 - 2a + 2.5.$$

Set the two to be equal: $3 = 0.5a^2 - 2a + 2.5$ or $a^2 - 4a - 1 = 0$ or $a = 2 \pm \sqrt{5}$. We can reject $a = 2 - \sqrt{5} < 2$. Thus $a = 2 + \sqrt{5}$.

Answer to Exercise 405 (9740 N2011/I/8). (i) $\int \frac{1}{100 - v^2}\,dv = \frac{1}{20}\ln\frac{10 + v}{10 - v} + C$, where $C$ is the constant of integration.

(ii) (a)
$$\int \frac{1}{10 - 0.1v^2}\,dv = \int dt \iff \int \frac{10}{100 - v^2}\,dv = t.$$

So $t = \frac{1}{2}\ln\left|\frac{10 + v}{10 - v}\right| + D$. At $t = 0$, $v = 0$, so $D = -\frac{1}{2}\ln\left|\frac{10}{10}\right| = 0$. Moreover, $v \leq 10$. Hence, $t = \frac{1}{2}\ln\frac{10 + v}{10 - v}$.

$$t(v = 5) = \frac{1}{2}\ln\frac{15}{5} = \frac{1}{2}\ln 3.$$

(b)
$$t = \frac{1}{2}\ln\frac{10 + v}{10 - v} \iff e^{2t} = \frac{10 + v}{10 - v} = -1 + \frac{20}{10 - v} \iff v = 10 - \frac{20}{e^{2t} + 1}.$$

At $t = 1$, $v = 10 - \frac{20}{e^2 + 1}$.

(c) $\lim_{t \to \infty} v = 10$.

Answer to Exercise 406 (9740 N2011/II/2). (i) The box has length $2(n - x)$, breadth $n - 2x$, and height $x$. It thus has volume $V = 2(n - x)(n - 2x)x = 2n^2x - 6nx^2 + 4x^3$.

(ii) $\frac{dV}{dx} = 2n^2 - 12nx + 12x^2$. So $\frac{dV}{dx} = 0$ if and only if

$$x = \frac{12n \pm \sqrt{(-12n)^2 - 4(12)(2n^2)}}{24} = \frac{12n \pm \sqrt{144n^2 - 96n^2}}{24} = 0.5n \pm \frac{\sqrt{3}\,n}{6}.$$

The larger value of $x$ may be rejected because in that case, the breadth of the box would be $n - 2x = n - 2\left(0.5n + \frac{\sqrt{3}\,n}{6}\right) < 0$. So the only stationary value of $V$ occurs at $x = \left(0.5 - \frac{\sqrt{3}}{6}\right)n$.

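Answer to Exercise 407 (9740 N2011/II/4). (a) (i) Integrate by parts twice, first with $u = x^2$, $v' = e^{-2x}$ and then with $u = x$, $v' = e^{-2x}$:

$$\int_0^n x^2 e^{-2x}\,dx = \left[x^2\,\frac{e^{-2x}}{-2}\right]_0^n - \int_0^n 2x\,\frac{e^{-2x}}{-2}\,dx = n^2\,\frac{e^{-2n}}{-2} + \int_0^n x e^{-2x}\,dx$$
$$= -0.5n^2e^{-2n} + \left[x\,\frac{e^{-2x}}{-2}\right]_0^n - \int_0^n \frac{e^{-2x}}{-2}\,dx = -0.5n^2e^{-2n} - 0.5ne^{-2n} + 0.5\left[\frac{e^{-2x}}{-2}\right]_0^n$$
$$= -0.5n^2e^{-2n} - 0.5ne^{-2n} - 0.25\left(e^{-2n} - 1\right) = -\frac{1}{4}e^{-2n}\left(2n^2 + 2n + 1\right) + \frac{1}{4}.$$

(ii)
$$\int_0^\infty x^2 e^{-2x}\,dx = \lim_{n \to \infty}\int_0^n x^2 e^{-2x}\,dx = \lim_{n \to \infty}\left[-\frac{1}{4}e^{-2n}\left(2n^2 + 2n + 1\right) + \frac{1}{4}\right]$$
$$= -\frac{1}{4}\lim_{n \to \infty}\left[e^{-2n}\left(2n^2 + 2n + 1\right)\right] + \frac{1}{4} = -\frac{1}{4}\left[\lim_{n \to \infty}\left(2e^{-2n}n^2\right) + \lim_{n \to \infty}\left(2e^{-2n}n\right) + \lim_{n \to \infty}\left(e^{-2n}\right)\right] + \frac{1}{4} = \frac{1}{4}.$$

(b) Using the substitution $x = \tan\theta$:

$$\int_0^1 \pi y^2\,dx = \int_0^1 \pi\left(\frac{4x}{x^2 + 1}\right)^2 dx = \int_0^{\pi/4} \pi\left(\frac{4\tan\theta}{\tan^2\theta + 1}\right)^2 \sec^2\theta\,d\theta$$
$$= 16\pi\int_0^{\pi/4}\left(\frac{\tan\theta\sec\theta}{\tan^2\theta + 1}\right)^2 d\theta = 16\pi\int_0^{\pi/4}\left(\frac{\tan\theta\sec\theta}{\sec^2\theta}\right)^2 d\theta = 16\pi\int_0^{\pi/4}\left(\frac{\tan\theta}{\sec\theta}\right)^2 d\theta$$
$$= 16\pi\int_0^{\pi/4}\sin^2\theta\,d\theta = 16\pi\int_0^{\pi/4}\frac{1 - \cos 2\theta}{2}\,d\theta = 16\pi\left[\frac{\theta}{2} - \frac{\sin 2\theta}{4}\right]_0^{\pi/4} = 16\pi\left(\frac{\pi}{8} - \frac{1}{4}\right) = 2\pi^2 - 4\pi.$$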

Answer to Exercise 408 (9740 N2010/I/2). (i) $e^x = 1 + x + 0.5x^2 + \dots$ and $1 + \sin 2x = 1 + 2x + \dots$. Thus, $e^x(1 + \sin 2x) = \left(1 + x + 0.5x^2 + \dots\right)(1 + 2x + \dots) = 1 + 3x + 2.5x^2 + \dots$

(ii)
$$\left(1 + \frac{4x}{3}\right)^n = 1 + \frac{4nx}{3} + \frac{n(n - 1)}{2}\left(\frac{4x}{3}\right)^2 + \dots = 1 + \frac{4nx}{3} + \frac{8n(n - 1)x^2}{9} + \dots$$

So $3 = \frac{4n}{3}$ or $n = \frac{9}{4}$. And so $\frac{8n(n - 1)}{9} = 2 \cdot \frac{5}{4} = 2.5$, as desired.

Answer to Exercise 409 (9740 N2010/I/4). (i) Apply $\frac{d}{dx}$ to the given equation:

$$2x - 2y\frac{dy}{dx} + 2y + 2x\frac{dy}{dx} = 0 \iff 2(x - y)\frac{dy}{dx} = -2(x + y) \iff \frac{dy}{dx} = \frac{y + x}{y - x},$$

where the division by $x - y$ is permitted because it is impossible that $y - x = 0$. If $x = y$, then the given equation implies $2x^2 + 4 = 0$, which has no real solutions.

(ii) The tangent is parallel to the x-axis where $\frac{dy}{dx} = 0$ or $y + x = 0$ or $y = -x$. Plugging this into the given equation, we have $x^2 - x^2 - 2x^2 + 4 = 0$ or $x^2 = 2$ or $x = \pm\sqrt{2}$. The points are $\left(\mp\sqrt{2}, \pm\sqrt{2}\right)$.

Answer to Exercise 410 (9740 N2010/I/6). (i) $\beta \approx 0.347$ and $\gamma \approx 1.532$ (calculator).

(ii)
$$\left|\int_\beta^\gamma x^3 - 3x + 1\,dx\right| = \left|\left[\frac{x^4}{4} - 3\frac{x^2}{2} + x\right]_\beta^\gamma\right| \approx 0.781417.$$

(iii) The red line has equation $y = 1$. It intersects the curve at $x = 0$ and again at $x = -\sqrt{3}$. So the desired area is

$$\int_{-\sqrt{3}}^0 x^3 - 3x\,dx = \left[\frac{x^4}{4} - 3\frac{x^2}{2}\right]_{-\sqrt{3}}^0 = -\frac{9}{4} + 3 \times \frac{3}{2} = \frac{9}{4}.$$

(iv) The graph of $y = x^3 - 3x + 1 - k$ always has its maximum and minimum turning points at $x = -1$ and $x = 1$. To ensure that there are three real distinct roots, the maximum turning point must be above the horizontal axis AND the minimum turning point must be below the horizontal axis. Hence, we must have $(-1)^3 - 3(-1) + 1 - k > 0$ or $k < 3$ AND $1^3 - 3(1) + 1 - k < 0$ or $k > -1$. Altogether, $k \in (-1, 3)$.
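The calculator values in Exercise 410(i) and (ii) can be reproduced with a few lines of code. The sketch below finds the two positive roots of $x^3 - 3x + 1$ by bisection and then evaluates the area using the antiderivative $\frac{x^4}{4} - \frac{3x^2}{2} + x$. The function names and brackets are my own choices for illustration.

```python
def p(x):
    """The cubic x^3 - 3x + 1."""
    return x**3 - 3 * x + 1

def antiderivative(x):
    """An antiderivative of p: x^4/4 - 3x^2/2 + x."""
    return x**4 / 4 - 1.5 * x**2 + x

def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi] by bisection; assumes f(lo), f(hi) differ in sign."""
    flo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        fmid = f(mid)
        if (flo < 0) == (fmid < 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return (lo + hi) / 2

beta = bisect(p, 0.0, 1.0)    # p(0) = 1 > 0, p(1) = -1 < 0
gamma = bisect(p, 1.0, 2.0)   # p(1) = -1 < 0, p(2) = 3 > 0
area = abs(antiderivative(gamma) - antiderivative(beta))
print(f"beta = {beta:.3f}, gamma = {gamma:.3f}, area = {area:.6f}")
```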

Answer to Exercise 411 (9740 N2010/I/7). (i) The differential equation is $\frac{d\theta}{dt} = k(20 - \theta)$. We are told that initially, $t = 0$, $\theta = 10$, $\frac{d\theta}{dt} = 1$. Hence, $k = 0.1$. So the differential equation is $\frac{d\theta}{dt} = 0.1(20 - \theta) = 2 - 0.1\theta$.

So $\int \frac{1}{2 - 0.1\theta}\,d\theta = \int dt \iff -10\ln(20 - \theta) = t + C$, where $C$ is the constant of integration. Rearranging, $\theta = 20 - e^{-0.1(t + C)} = 20 - Be^{-0.1t}$, where $B = e^{-0.1C}$ is a constant.

But at $t = 0$, $\theta = 10$, so $B = 10$. Altogether then, $\theta = 20 - 10e^{-0.1t}$, as desired.

(ii) $\theta = 20 - 10e^{-0.1t} = 15 \iff e^{-0.1t} = 0.5 \iff -0.1t = \ln 0.5 \iff t = 10\ln 2 \approx 6.931$ min. As $t \to \infty$, $\theta \to 20$.

[Figure: graph of $\theta = 20 - 10e^{-0.1t}$ against $t$, with horizontal asymptote $\theta = 20$.]
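One way to gain confidence in the solution of Exercise 411 is to integrate $\frac{d\theta}{dt} = 0.1(20 - \theta)$ numerically from $\theta(0) = 10$ and compare with $\theta = 20 - 10e^{-0.1t}$. The sketch below uses the classical fourth-order Runge-Kutta method, which is my choice of integrator and not something the exam requires.

```python
import math

def cooling_rate(theta):
    """Right-hand side of d(theta)/dt = 0.1 (20 - theta)."""
    return 0.1 * (20 - theta)

def rk4(theta0, t_end, steps=1000):
    """Integrate the (autonomous) equation from t = 0 to t_end with RK4."""
    h = t_end / steps
    theta = theta0
    for _ in range(steps):
        k1 = cooling_rate(theta)
        k2 = cooling_rate(theta + 0.5 * h * k1)
        k3 = cooling_rate(theta + 0.5 * h * k2)
        k4 = cooling_rate(theta + h * k3)
        theta += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return theta

def closed_form(t):
    """The solution found in (i)."""
    return 20 - 10 * math.exp(-0.1 * t)

t_star = 10 * math.log(2)   # time at which theta should reach 15
print(rk4(10.0, t_star), closed_form(t_star))
```

Both numbers printed should be very close to 15.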

Answer to Exercise 412 (9740 N2010/I/9). (i) $V = 3x^2y = 300 \implies y = \frac{100}{x^2}$ (1).

$A = 2xy + 6xy + 3x^2 + 2kxy + 6kxy + 3x^2 = (8 + 8k)xy + 6x^2$. Plugging in (1), $A = 100\,\frac{8 + 8k}{x} + 6x^2$. Differentiating:

$$\frac{dA}{dx} = -100\,\frac{8 + 8k}{x^2} + 12x.$$
$$\frac{dA}{dx} = 0 \iff -100(8 + 8k) + 12x^3 = 0 \iff x = \left[\frac{100(8 + 8k)}{12}\right]^{1/3} = \left[\frac{200}{3}(1 + k)\right]^{1/3}.$$

(ii)
$$\frac{y}{x} = \frac{100}{x^2} \Big/ x = \frac{100}{x^3} = 100 \Big/ \frac{200(1 + k)}{3} = \frac{3}{2(1 + k)} = \frac{1.5}{1 + k}.$$

(iii) $k \in (0, 1] \implies \frac{1.5}{1 + k} \in [0.75, 1.5)$.

(iv) I interpret "ends" to mean the sides of the box that have height $y$ and base $x$. In which case, $\frac{y}{x} = 1 \implies k = 0.5$.

Answer to Exercise 413 (9740 N2010/I/11). (i) $\frac{dy}{dx} = \frac{dy}{dt} \div \frac{dx}{dt} = \frac{1 + t^{-2}}{1 - t^{-2}} = \frac{t^2 + 1}{t^2 - 1}$, for $t \neq 0, 1$. So at the point P, the tangent has equation

$$y - p + \frac{1}{p} = \frac{p^2 + 1}{p^2 - 1}\left(x - p - \frac{1}{p}\right)$$
$$\iff \left(p^2 - 1\right)y + \left(p^2 - 1\right)\left(-p + \frac{1}{p}\right) = \left(p^2 + 1\right)\left(x - p - \frac{1}{p}\right)$$
$$\iff \left(p^2 + 1\right)x - \left(p^2 - 1\right)y = \left(p^2 + 1\right)\left(p + \frac{1}{p}\right) + \left(p^2 - 1\right)\left(-p + \frac{1}{p}\right) = 4p.$$

(ii) The point A is given by $\left(p^2 + 1\right)x - \left(p^2 - 1\right)x = 4p$ or $x = 2p$. So $A = (2p, 2p)$.

The point B is given by $\left(p^2 + 1\right)x - \left(p^2 - 1\right)(-x) = 4p$ or $2p^2x = 4p$ or $x = \frac{2}{p}$. So $B = \left(\frac{2}{p}, -\frac{2}{p}\right)$.

Observe that since $y = x$ and $y = -x$ are perpendicular, OA and OB are likewise perpendicular. Hence, the area of the triangle OAB is simply $0.5 \times \text{Base} \times \text{Height}$ or

$$0.5\,\left|(2p, 2p)\right|\,\left|\left(\frac{2}{p}, -\frac{2}{p}\right)\right| = 0.5\sqrt{(2p)^2 + (2p)^2}\,\sqrt{\left(\frac{2}{p}\right)^2 + \left(-\frac{2}{p}\right)^2} = 0.5\left(\sqrt{8}\,p\right)\left(\frac{\sqrt{8}}{p}\right) = 4.$$

(iii) Observe that $x + y = 2t$ and $x - y = \frac{2}{t}$. Hence, $(x + y)(x - y) = 4$ or $x^2 - y^2 = 4$.

This of course is simply an east-west hyperbola, with horizontal intercepts $\pm 2$, asymptotes $y = \pm x$, centre $(0, 0)$, and lines of symmetry $y = 0$ and $x = 0$.

[Figure: the hyperbola $x^2 - y^2 = 4$, with centre $(0, 0)$, horizontal intercepts $-2$ and $2$, linear asymptotes $y = x$ and $y = -x$, and lines of symmetry $x = 0$ and $y = 0$.]

Answer to Exercise 414 (9740 N2010/II/3). (i)

$$\frac{dy}{dx} = \sqrt{x + 2} + \frac{0.5x}{\sqrt{x + 2}} = \frac{x + 2 + 0.5x}{\sqrt{x + 2}} = \frac{1.5x + 2}{\sqrt{x + 2}}. \qquad \frac{dy}{dx} = 0 \iff x = \frac{-4}{3}.$$

This is the only stationary point. However, as we know, stationary points are either inflexion points or turning points. We must check that this is indeed a turning point. One way to do this is through the second derivative test:

$$\frac{d^2y}{dx^2} = \frac{\sqrt{x + 2}\,(1.5) - (1.5x + 2)(0.5)(x + 2)^{-0.5}}{x + 2} = \frac{1.5}{\sqrt{x + 2}} - \frac{dy}{dx}\,\frac{0.5}{x + 2}.$$

Evaluated at $x = -4/3$, the second derivative equals $1.5/\sqrt{-4/3 + 2} > 0$, so this is a minimum turning point.

(ii) (a) The equation $y^2 = x^2(x + 2)$ is equivalent to the equation $y = \pm x\sqrt{x + 2}$.

So the curve may be decomposed into the graphs of two functions $f: x \mapsto x\sqrt{x + 2}$ and $g: x \mapsto -x\sqrt{x + 2}$, both with domain $[-2, \infty)$ and codomain $\mathbb{R}$.

We have $f'(x) = (1.5x + 2)/\sqrt{x + 2}$ and $g'(x) = -(1.5x + 2)/\sqrt{x + 2}$. Thus, the two possible values of the gradient at the point where $x = 0$ are $f'(0) = 2/\sqrt{2} = \sqrt{2}$ and $g'(0) = -\sqrt{2}$.

(b) The easy way is to just use your graphing calculator and copy. But as an exercise, let's try to do this without a graphing calculator. We first graph $y = x^2(x + 2) = x^3 + 2x^2$. This is a cubic with horizontal intercepts $-2$ and $0$ and vertical intercept $0$. $dy/dx = 3x^2 + 4x = x(3x + 4)$ and $d^2y/dx^2 = 6x + 4$. So the two stationary points are $0$ and $-4/3$, the former being a minimum turning point and the latter a maximum turning point.

We now graph $y^2 = x^2(x + 2)$. Recall that this is symmetric in the horizontal axis. Moreover, where $x^2(x + 2) < 0$, the graph has no points.

[Figure: sketch of the curve $y^2 = x^2(x + 2)$.]

(iii) The easy way is to just use your graphing calculator and copy. But as an exercise, let's try to do this without a graphing calculator. Observe that

$$\frac{1.5x + 2}{\sqrt{x + 2}} = \sqrt{\frac{2.25x^2 + 6x + 4}{x + 2}} = \sqrt{2.25x + 1.5 + \frac{1}{x + 2}}.$$

Hence, we want to graph $y = \sqrt{2.25x + 1.5 + \frac{1}{x + 2}}$.

We first graph $y = \frac{2.25x^2 + 6x + 4}{x + 2} = 2.25x + 1.5 + \frac{1}{x + 2}$. This is the hyperbola with asymptotes $y = 2.25x + 1.5$ and $x = -2$ and centre $(-2, -3)$.

Now note that the graph of $y = \sqrt{2.25x + 1.5 + \frac{1}{x + 2}}$ is simply the region of the graph of $y^2 = 2.25x + 1.5 + \frac{1}{x + 2}$ where $y > 0$.

Note that since the graph of $y = \frac{2.25x^2 + 6x + 4}{x + 2}$ has asymptotes $y = 2.25x + 1.5$ and $x = -2$, the graph of $y = \sqrt{\frac{2.25x^2 + 6x + 4}{x + 2}}$ must also have asymptotes $y = \sqrt{2.25x + 1.5}$ and $x = -2$. Of course, $y = \sqrt{2.25x + 1.5}$ is not an oblique asymptote, but it is an asymptote nonetheless.

[Figure: the required graph, with the vertical asymptote $x = -2$ marked.]

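Answer to Exercise 415 (9740 N2009/I/2).

$$\int_0^1 \frac{1}{4 - x^2}\,dx = \frac{1}{4}\int_0^1 \left(\frac{1}{2 - x} + \frac{1}{2 + x}\right)dx = \frac{1}{4}\Big[-\ln(2 - x) + \ln(2 + x)\Big]_0^1 = \frac{1}{4}\left[0 + \ln 3 + \ln 2 - \ln 2\right] = \frac{\ln 3}{4}.$$

$$\int_0^{1/2p} \frac{1}{\sqrt{1 - p^2x^2}}\,dx = \left[\frac{\sin^{-1}(px)}{p}\right]_0^{1/2p} = \frac{\sin^{-1}(0.5)}{p} = \frac{\pi}{6p}.$$

So $p = \frac{4\pi}{6\ln 3} = \frac{2\pi}{3\ln 3}$.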

Answer to Exercise 416 (9740 N2009/I/4). (i) $f(27) = f(3) = 5$ and $f(45) = f(1) = 6$. So $f(27) + f(45) = 11$.

(ii) [Figure: graph of $y = f(x)$.]

(iii)
$$\int_{-4}^3 f(x)\,dx = \int_{-4}^{-2} f(x)\,dx + \int_{-2}^0 f(x)\,dx + \int_0^2 f(x)\,dx + \int_2^3 f(x)\,dx$$
$$= \int_0^2 7 - x^2\,dx + \int_2^4 2x - 1\,dx + \int_0^2 7 - x^2\,dx + \int_2^3 2x - 1\,dx$$
$$= 2 \times \left[7x - \frac{x^3}{3}\right]_0^2 + \left[x^2 - x\right]_2^4 + \left[x^2 - x\right]_2^3$$
$$= 2 \times \left(14 - \frac{8}{3}\right) + 16 - 4 + 9 - 3 - 2 \times (4 - 2) = 36\tfrac{2}{3}.$$
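The periodicity argument in Exercise 416 can be checked numerically. The sketch below assumes, from the working above, that $f$ has period 4 with $f(x) = 7 - x^2$ for $0 < x \leq 2$ and $f(x) = 2x - 1$ for $2 < x \leq 4$. That definition is my reading of the answer, since the question itself is not reproduced here.

```python
def f(x):
    """Periodic function of period 4 (assumed definition, see lead-in)."""
    r = x % 4                      # reduce to one period; Python keeps r in [0, 4)
    return 7 - r * r if r <= 2 else 2 * r - 1

def simpson(g, a, b, n=200):
    """Composite Simpson's rule on [a, b] with n subintervals (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

# Integrate piece by piece so that each piece is a single polynomial.
pieces = [(-4, -2), (-2, 0), (0, 2), (2, 3)]
total = sum(simpson(f, a, b) for a, b in pieces)
print(f(27) + f(45), total)
```

Simpson's rule is exact for polynomials of degree at most three, so the total should agree with $36\tfrac{2}{3}$ up to rounding error.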

Answer to Exercise 417 (9740 N2009/I/7). (i) $f(0) = e^{\cos 0} = e^1 = e$. $f'(x) = e^{\cos x}(-\sin x)$ and so $f'(0) = e^{\cos 0}(-\sin 0) = 0$. $f''(x) = e^{\cos x}(-\sin x)(-\sin x) - e^{\cos x}\cos x$ and so $f''(0) = 0 - e^{\cos 0}\cos 0 = -e$. So $e^{\cos x} = e + 0x + (-e)\frac{x^2}{2!} + \dots = e - 0.5ex^2 + \dots$

(ii) $\frac{1}{a + bx^2} = \frac{1}{a}\left(1 + \frac{b}{a}x^2\right)^{-1} = \frac{1}{a}\left[1 + (-1)\left(\frac{b}{a}x^2\right) + \dots\right]$, so $e = \frac{1}{a}$ and $-0.5e = -\frac{b}{a^2}$. Or $a = \frac{1}{e}$ and $b = 0.5ea^2 = \frac{1}{2e}$.

Answer to Exercise 418 (9740 N2009/I/11). (i) [Figure: graph of $y = f(x)$.]

(ii) $f'(x) = e^{-x^2} + xe^{-x^2}(-2x) = e^{-x^2}\left(1 - 2x^2\right) = 0 \iff x = \pm\sqrt{0.5}$. These are the two stationary points. Remember that stationary points are either turning points or points of inflexion. Here we are asked for the turning points. So we need to check whether these two points are turning points or points of inflexion.

$f''(x) = e^{-x^2}(-2x)\left(1 - 2x^2\right) + e^{-x^2}(-4x) = -2xf'(x) - 4e^{-x^2}x$. $f''\left(-\sqrt{0.5}\right) > 0$ and $f''\left(\sqrt{0.5}\right) < 0$, so the former is a minimum turning point and the latter is a maximum turning point.

So the two turning points are $\left(\pm\sqrt{0.5}, \pm\sqrt{0.5}\,e^{-0.5}\right)$.

(iii) Using the substitution $u = x^2$:

$$\int_0^n xe^{-x^2}\,dx = 0.5\int_0^{n^2} e^{-u}\,du = -0.5\left[e^{-u}\right]_0^{n^2} = 0.5\left(1 - e^{-n^2}\right) \implies \lim_{n \to \infty}\int_0^n xe^{-x^2}\,dx = 0.5.$$

So the desired area is 0.5.

(iv) $\int_{-2}^2 \left|xe^{-x^2}\right| dx = 2\int_0^2 xe^{-x^2}\,dx = 1 - e^{-4}$.

(v) $\pi\int_0^1 y^2\,dx = \pi\int_0^1 \left(xe^{-x^2}\right)^2 dx \approx 0.1157021808561729\pi \approx 0.363$ (calculator).
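Parts (iv) and (v) of Exercise 418 lend themselves to a quick numerical check. The sketch below uses Simpson's rule as a stand-in for the calculator; the helper names are my own.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n subintervals (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def f(x):
    """f(x) = x e^{-x^2}."""
    return x * math.exp(-x * x)

# (iv): by symmetry, the area between x = -2 and x = 2 is twice the area on [0, 2].
area_iv = 2 * simpson(f, 0.0, 2.0)

# (v): volume of revolution about the x-axis for 0 <= x <= 1.
volume_v = math.pi * simpson(lambda x: f(x) ** 2, 0.0, 1.0)

print(area_iv, 1 - math.exp(-4))
print(volume_v)
```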

Answer to Exercise 419 (9740 N2009/II/1). (i) Use your graphing calculator and copy. (You are not supposed to know how to graph this without a graphing calculator.)

[Figure: sketch of the curve C.]

(ii) $\frac{dy}{dx} = \left(\frac{dy}{dt}\right) \div \left(\frac{dx}{dt}\right) = \frac{3t^2 + 2t}{2t + 4}$, which when evaluated at $t = 2$ is equal to $\frac{3 \times 2^2 + 2 \times 2}{2 \times 2 + 4} = 2$. So the equation of $l$ is $y = 2x + c$, where $c = 2^3 + 2^2 - 2\left(2^2 + 4(2)\right) = -12$. So the equation of $l$ is $y = 2x - 12$.

(iii) The intersection points of $l$ and C are given by $t^3 + t^2 = 2\left(t^2 + 4t\right) - 12$ or $t^3 - t^2 - 8t + 12 = 0$. We want to find the solutions to this last equation. We know that $t = 2$ is one, because P is an intersection point.

Now write $t^3 - t^2 - 8t + 12 = (t - 2)\left(t^2 + at + b\right) = t^3 + (a - 2)t^2 + (b - 2a)t - 2b$. So $a = 1$ and $b = -6$. So $t^3 - t^2 - 8t + 12 = (t - 2)\left(t^2 + t - 6\right) = (t - 2)(t + 3)(t - 2)$. So there is one other intersection point, given by $t = -3$. This corresponds to $Q = \left((-3)^2 + 4(-3), (-3)^3 + (-3)^2\right) = (-3, -18)$.

Answer to Exercise 420 (9740 N2009/II/4). (i) dn/dt = 10t − 3t2 + C and n = 5t2 − t3 + Ct + D, for constants of integration C and D.

When t = 0, n = 100, so D = 100. Below are sketched the curves of n = 5t2 − t3 + 100, n = 5t2 − t3 + t + 100, and n = 5t2 − t3 + 2t + 100.

[Figure: sketches of $n = 5t^2 - t^3 + 100$, $n = 5t^2 - t^3 + t + 100$, and $n = 5t^2 - t^3 + 2t + 100$ against $t$, each starting at $n = 100$ when $t = 0$.]

(ii) Under the given model, $\frac{dn}{dt} \geq 0$ and $n \leq 150$.

$$\int \frac{1}{3 - 0.02n}\,dn = \int dt \implies 50\int \frac{1}{150 - n}\,dn = t + A = -50\ln(150 - n)$$
$$\implies e^{\frac{t + A}{-50}} = 150 - n \implies n = 150 - e^{\frac{t + A}{-50}} = 150 - Be^{-0.02t}.$$

When $t = 0$, $n = 100$, so $B = 50$. Altogether then $n = 150 - 50e^{-0.02t}$. As $t \to \infty$, $n \to 150$. The population will approach 150 thousand.

Answer to Exercise 421 (9740 N2008/I/1). The green area is

$$\int_1^2 x^2\,dx = \left[\frac{x^3}{3}\right]_1^2 = \frac{7}{3}.$$

The red area is

$$\int_a^4 \sqrt{y}\,dy = \left[\frac{y^{1.5}}{1.5}\right]_a^4 = \frac{8 - a^{1.5}}{1.5}.$$

The green and red areas are equal:

$$\frac{7}{3} = \frac{8 - a^{1.5}}{1.5} \iff 7 = 16 - 2a^{1.5} \iff 2a^{1.5} = 9 \iff a = 4.5^{2/3} \approx 2.726.$$

Answer to Exercise 422 (9740 N2008/I/4). (i) $y = 1.5\ln\left(x^2 + 1\right) + C$.

(ii) $2 = 1.5\ln 1 + C \implies C = 2$. So $y = 1.5\ln\left(x^2 + 1\right) + 2$.

(iii) $\frac{dy}{dx} \to 0$.

(iv) [Figure: graphs of $y = 1.5\ln\left(x^2 + 1\right) + 2$, $y = 1.5\ln\left(x^2 + 1\right) + 1$, and $y = 1.5\ln\left(x^2 + 1\right)$.]

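Answer to Exercise 423 (9740 N2008/I/5). (i)

$$\int_0^{1/\sqrt{3}} \frac{1}{1 + 9x^2}\,dx = \left[\frac{\tan^{-1}(3x)}{3}\right]_0^{1/\sqrt{3}} = \frac{1}{3}\tan^{-1}\sqrt{3} = \frac{\pi}{9}.$$

(ii) Integrate by parts with $u = \ln x$ and $v' = x^n$:

$$\int_1^e x^n\ln x\,dx = \left[\frac{x^{n+1}}{n + 1}\ln x\right]_1^e - \int_1^e \frac{x^{n+1}}{n + 1}\,\frac{1}{x}\,dx = \frac{e^{n+1}}{n + 1} - \int_1^e \frac{x^n}{n + 1}\,dx = \frac{e^{n+1}}{n + 1} - \left[\frac{x^{n+1}}{(n + 1)^2}\right]_1^e$$
$$= \frac{e^{n+1}}{n + 1} - \frac{e^{n+1} - 1}{(n + 1)^2} = \frac{e^{n+1}}{n + 1} + \frac{1 - e^{n+1}}{(n + 1)^2}.$$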

Answer to Exercise 424 (9740 N2008/I/6). (a) By the Law of Cosines, $AC^2 = AB^2 + BC^2 - 2(AB)(BC)\cos\theta = 1 + 9 - 6\cos\theta = 10 - 6\cos\theta$. But $10 - 6\cos\theta = 10 - 6\left(1 - 0.5\theta^2 + \dots\right) = 4 + 3\theta^2 + \dots$. So $AC = \sqrt{4 + 3\theta^2 + \dots} \approx \sqrt{4 + 3\theta^2}$, as desired.

Moreover,

$$\left(4 + 3\theta^2 + \dots\right)^{0.5} = 4^{0.5}\left(1 + 0.75\theta^2 + \dots\right)^{0.5} = 2\left[1 + 0.5 \times 0.75\theta^2 + \dots\right] = 2 + 0.75\theta^2 + \dots$$

Hence, $AC \approx 2 + 0.75\theta^2$, as desired.

(b) $f(0) = \tan(2 \times 0 + \pi/4) = \tan\frac{\pi}{4} = 1$.

$f'(x) = 2\sec^2\left(2x + \frac{\pi}{4}\right)$ and so $f'(0) = 2\sec^2\frac{\pi}{4} = 4$.

$f''(x) = 4\sec^2\left(2x + \frac{\pi}{4}\right)\tan\left(2x + \frac{\pi}{4}\right) \times 2$ and so $f''(0) = 8\sec^2\frac{\pi}{4}\tan\frac{\pi}{4} = 16$.

Hence, $f(x) = 1 + 4x + \frac{16x^2}{2!} + \dots = 1 + 4x + 8x^2 + \dots$

Answer to Exercise 425 (9740 N2008/I/7). The total time to build the wall is $3(x + 2y) + 9(0.5\pi x) = 180$. Rearranging, $y = 30 - (0.75\pi + 0.5)x$ (1).

The flower-bed has area $A = xy + 0.5\pi(x/2)^2$. Plug in (1) to get $A = x\left[30 - (0.75\pi + 0.5)x\right] + 0.5\pi(x/2)^2$. This is a ∩-shaped quadratic with maximum point given by

$$x = \frac{-b}{2a} = \frac{-30}{2\left[0.5\pi(1/2)^2 - (0.75\pi + 0.5)\right]} = \frac{15}{(0.75\pi + 0.5) - 0.5\pi(1/2)^2} = \frac{15}{0.75\pi + 0.5 - \pi/8} = \frac{120}{5\pi + 4}.$$

The corresponding value of $y$ is

$$y = 30 - (0.75\pi + 0.5)x = 30 - (0.75\pi + 0.5)\frac{120}{5\pi + 4} = 30 - \frac{90\pi + 60}{5\pi + 4} = \frac{60\pi + 60}{5\pi + 4} = 60\,\frac{\pi + 1}{5\pi + 4}.$$

Answer to Exercise 426 (9740 N2008/II/1). (i) Use your calculator and copy.

[Figure: graphs of $y = e^x\sin x$ and its cubic Maclaurin approximation.]

(ii)
$$e^x\sin x = \left(1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \dots\right)\left(x - \frac{x^3}{6} + \dots\right) = x + x^2 + \frac{x^3}{2} - \frac{x^3}{6} + \dots = x + x^2 + \frac{x^3}{3} + \dots$$

(iii) See above.

(iv) $|g(x) - f(x)| = 0.5 \iff x \approx -1.96, 1.56$ (calculator). Hence, also from our observation of the graphs of $f$ and $g$, we have $|g(x) - f(x)| < 0.5 \iff -1.96 < x < 1.56$.
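The two calculator values in Exercise 426(iv) can be recovered by solving $|g(x) - f(x)| = 0.5$ numerically. In the sketch below I take $f(x) = e^x\sin x$ and $g(x) = x + x^2 + \frac{x^3}{3}$, the cubic found in (ii). The bisection brackets were chosen by hand and are an assumption of this example.

```python
import math

def f(x):
    return math.exp(x) * math.sin(x)

def g(x):
    """Cubic Maclaurin approximation of f from part (ii)."""
    return x + x**2 + x**3 / 3

def gap(x):
    """|g(x) - f(x)| - 0.5, which is zero at the two boundary points."""
    return abs(g(x) - f(x)) - 0.5

def bisect(func, lo, hi, tol=1e-12):
    """Root of func in [lo, hi] by bisection; assumes a sign change on the bracket."""
    flo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        fmid = func(mid)
        if (flo < 0) == (fmid < 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return (lo + hi) / 2

left = bisect(gap, -2.5, -1.0)
right = bisect(gap, 1.0, 2.0)
print(f"{left:.2f} < x < {right:.2f}")
```

At $x = 0$ the gap $|g - f|$ is zero, so the approximation is within 0.5 between the two roots, not outside them.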

Answer to Exercise 427 (9740 N2008/II/2). (i) $2\int_0^1 \sqrt{x\sqrt{1 - x}}\,dx \approx 0.998879$ (calculator).

(ii) Using the substitution $u = 1 - x$:

$$\pi\int_0^1 y^2\,dx = \pi\int_0^1 x\sqrt{1 - x}\,dx = -\pi\int_1^0 (1 - u)\sqrt{u}\,du = -\pi\left[\frac{u^{1.5}}{1.5} - \frac{u^{2.5}}{2.5}\right]_1^0 = \frac{4\pi}{15}.$$

(iii) Apply $\frac{d}{dx}$ to the equation: $2y\frac{dy}{dx} = \sqrt{1 - x} + x(0.5)(1 - x)^{-0.5}(-1)$.

At the maximum point, $\frac{dy}{dx} = 0$. So $\sqrt{1 - x} = 0.5x(1 - x)^{-0.5}$ or $1 - x = 0.5x$ or $x = 2/3$.

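Answer to Exercise 428 (9233 N2008/I/2).

$$\frac{\cos 2x}{\sqrt{1 + x^2}} = \cos 2x\left(1 + x^2\right)^{-0.5} = \left[1 - \frac{(2x)^2}{2!} + \dots\right]\left[1 + (-0.5)x^2 + \dots\right] = 1 - 0.5x^2 - 2x^2 + \dots = 1 - 2.5x^2 + \dots \implies a = 1,\ b = -2.5.$$

Answer to Exercise 429 (9233 N2008/I/3). Integrate by parts with $u = x$ and $v' = e^{-2x}$:

$$\int_0^1 xe^{-2x}\,dx = \left[x\,\frac{e^{-2x}}{-2}\right]_0^1 - \int_0^1 \frac{e^{-2x}}{-2}\,dx = -\frac{1}{2}e^{-2} + \frac{1}{2}\left[\frac{e^{-2x}}{-2}\right]_0^1 = -\frac{1}{2}e^{-2} - \frac{1}{4}\left(e^{-2} - 1\right) = \frac{1}{4} - \frac{3}{4}e^{-2}.$$

Answer to Exercise 430 (9233 N2008/I/4). Using the substitution $x = e^t$:

$$\int_e^{e^3} \frac{1}{x(\ln x)^2}\,dx = \int_1^3 \frac{1}{e^t t^2}\,e^t\,dt = \int_1^3 \frac{1}{t^2}\,dt = -\left[\frac{1}{t}\right]_1^3 = \frac{2}{3}.$$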

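Answer to Exercise 431 (9233 N2008/I/6). (i)

$$|x - a| = \begin{cases} x - a, & \text{for } x \geq a, \\ a - x, & \text{for } x < a. \end{cases}$$

[Figure: graph of $y = |x - a|$, a V shape with vertex $(a, 0)$; the points $-b$ and $b$ are marked on the x-axis.]

(ii)
$$\int_{-b}^b |x - a|\,dx = \int_{-b}^a a - x\,dx + \int_a^b x - a\,dx = \left[ax - \frac{x^2}{2}\right]_{-b}^a + \left[\frac{x^2}{2} - ax\right]_a^b$$
$$= a^2 - \frac{a^2}{2} + ab + \frac{b^2}{2} + \frac{b^2}{2} - ab - \frac{a^2}{2} + a^2 = a^2 + b^2.$$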

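Answer to Exercise 432 (9233 N2008/I/8).

$$\int_a^\infty \frac{1}{4 + x^2}\,dx = \frac{1}{4}\int_a^\infty \frac{1}{1 + \frac{1}{4}x^2}\,dx = \frac{1}{4}\left[\frac{\tan^{-1}\frac{x}{2}}{1/2}\right]_a^\infty = \frac{1}{2}\left(\tan^{-1}\infty - \tan^{-1}\frac{a}{2}\right) = \frac{\pi}{4} - 0.5\tan^{-1}\frac{a}{2}.$$

$$\int_{1/2}^{\sqrt{3}/2} \frac{1}{\sqrt{1 - x^2}}\,dx = \left[\sin^{-1}x\right]_{1/2}^{\sqrt{3}/2} = \frac{\pi}{3} - \frac{\pi}{6} = \frac{\pi}{6}.$$

The two expressions are equal:

$$\frac{\pi}{4} - 0.5\tan^{-1}\frac{a}{2} = \frac{\pi}{6} \iff \frac{a}{2} = \tan\left[2\left(\frac{\pi}{4} - \frac{\pi}{6}\right)\right] = \tan\frac{\pi}{6} = \frac{1}{\sqrt{3}} \iff a = \frac{2}{\sqrt{3}}.$$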

Answer to Exercise 433 (9233 N2008/I/10). (i)

$$x^2z\left(z + x\frac{dz}{dx}\right) = x^2 + x^2z^2 \implies x^2z\left(x\frac{dz}{dx}\right) = x^2 \implies x^2\left(1 - xz\frac{dz}{dx}\right) = 0.$$

So if $x \neq 0$, then $xz\frac{dz}{dx} = 1$, as desired.

(ii) $\int z\,dz = \int \frac{1}{x}\,dx \implies \frac{z^2}{2} = \ln|x| + C \implies \frac{y^2/x^2}{2} = \ln|x| + C$, provided $x \neq 0$.

$C = 0.5\left(6^2/2^2\right) - \ln|2| = 4.5 - \ln 2$. So the solution is

$y^2 = x^2(2\ln|x| + 9 - 2\ln 2)$ for $x \neq 0$ and $y = 0$ for $x = 0$.

Answer to Exercise 434 (9233 N2008/I/13). (i) $\frac{dy}{dx} = \frac{dy}{dt} \div \frac{dx}{dt} = \frac{3\sin^2 t\cos t}{3\cos^2 t(-\sin t)} = -\frac{\sin t}{\cos t}$.

So the normal to the curve has gradient $\cos t/\sin t$. Its equation is $\cos t\left(x - \cos^3 t\right) = \sin t\left(y - \sin^3 t\right)$ or $x\cos t - y\sin t = \cos^4 t - \sin^4 t$, as desired.

(ii) $\cos^4 t - \sin^4 t = \left(\cos^2 t + \sin^2 t\right)\left(\cos^2 t - \sin^2 t\right) = 1 \times \cos 2t = \cos 2t$.

(iii) The horizontal intercept of the normal at P is given by $y = 0$ and thus $x_A\cos t = \cos^4 t - \sin^4 t = \cos 2t$ or $x_A = \frac{\cos 2t}{\cos t}$.

The vertical intercept of the normal at P is given by $x = 0$ and thus $-y_B\sin t = \cos^4 t - \sin^4 t = \cos 2t$ or $y_B = -\frac{\cos 2t}{\sin t}$.

So by the Pythagorean Theorem, the length of AB is $\sqrt{x_A^2 + y_B^2}$ or

$$\sqrt{\left(\frac{\cos 2t}{\cos t}\right)^2 + \left(-\frac{\cos 2t}{\sin t}\right)^2} = \cos 2t\sqrt{\frac{1}{\cos^2 t} + \frac{1}{\sin^2 t}} = \cos 2t\sqrt{\frac{\sin^2 t + \cos^2 t}{\sin^2 t\cos^2 t}}$$
$$= \cos 2t\sqrt{\frac{1}{\sin^2 t\cos^2 t}} = \frac{\cos 2t}{\sin t\cos t} = \frac{\cos 2t}{0.5\sin 2t} = 2\cot 2t.$$

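Answer to Exercise 435 (9233 N2008/II/1).

$$\cos 4x - \cos 6x = \cos(5x - x) - \cos(5x + x) = (\cos 5x\cos x + \sin 5x\sin x) - (\cos 5x\cos x - \sin 5x\sin x) = 2\sin 5x\sin x.$$

$$\int_0^{\pi/3} \sin 5x\sin x\,dx = \frac{1}{2}\int_0^{\pi/3} \cos 4x - \cos 6x\,dx = \frac{1}{2}\left[\frac{\sin 4x}{4} - \frac{\sin 6x}{6}\right]_0^{\pi/3} = -\frac{\sqrt{3}}{16},$$

since $\sin\frac{4\pi}{3} = -\frac{\sqrt{3}}{2}$ and $\sin 2\pi = 0$.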

Answer to Exercise 436 (9233 N2008/II/5). (i) Differentiating, we have

$$\frac{1}{1 + x} - \frac{2}{x + 2} - \frac{2x(-1)}{(x + 2)^2} = \frac{1}{1 + x} + \frac{2x - 2(x + 2)}{(x + 2)^2} = \frac{1}{1 + x} - \frac{4}{(x + 2)^2} = \frac{(x + 2)^2 - 4(1 + x)}{(1 + x)(x + 2)^2} = \frac{x^2}{(1 + x)(x + 2)^2}.$$

We are supposed to say that $\ln(1 + x) \in \mathbb{R} \implies 1 + x > 0$, and so the above expression is never negative. But see remark in footnote.99

(ii) When evaluated at $x = 0$, $\ln(1 + x) - 2x/(x + 2) = 0$. Equivalently, at $x = 0$, $\ln(1 + x) = 2x/(x + 2)$.

We showed in (i) that $\ln(1 + x) - 2x/(x + 2)$ is non-decreasing. And so $\ln(1 + x) - 2x/(x + 2) \geq 0$ for all $x \geq 0$. Or equivalently, $\ln(1 + x) \geq 2x/(x + 2)$.

Answer to Exercise 437 (9740 N2007/I/4).

(i)
$$4\int \frac{1}{2 - 3I}\,dI = \int dt \implies -\frac{4}{3}\ln|2 - 3I| = t + C \implies I = \frac{2 - e^{-0.75(t + C)}}{3} = \frac{2}{3} + Ae^{-0.75t}.$$

Since $t = 0 \implies I = 2$, we have $A = 4/3$. Altogether then, $I = 2\left(1 + 2e^{-0.75t}\right)/3$.

(ii) As $t \to \infty$, $I \to 2/3$.

99 Author's remark: The writers of this question made the elementary mistake of failing to state the domain and codomain of the function. They simply presumed that the codomain MUST somehow be $\mathbb{R}$. But now that we've learnt about complex numbers, there's no reason why we cannot have, for example, $\ln(1 + x) \in \mathbb{C}$. In which case we could certainly have $1 + x < 0$, because it turns out that, for example, $\ln(-1) = \pi i$.

Answer to Exercise 438 (9740 N2007/I/11). (i) The easy way is to plot the curve on your graphing calculator and just copy. But as an exercise, let's try sketching the curve without using a calculator.

$$\frac{dy}{dx} = \frac{dy}{dt} \div \frac{dx}{dt} = \frac{3\sin^2 t\cos t}{2\cos t(-\sin t)} = -1.5\sin t,$$

$$\frac{d^2y}{dx^2} = \frac{d}{dx}(-1.5\sin t) = \frac{d}{dt}(-1.5\sin t)\,\frac{dt}{dx} = \frac{-1.5\cos t}{2\cos t(-\sin t)} = \frac{3}{4\sin t}, \quad\text{for } t \neq 0.$$

So the graph is decreasing throughout, but at a decreasing rate.

At the endpoints, $t = 0 \implies (x, y) = (1, 0)$ and $t = 0.5\pi \implies (x, y) = (0, 1)$. Altogether then, we have a curve that falls from $(0, 1)$ to $(1, 0)$, steepest at $(0, 1)$ and flat at $(1, 0)$.

(ii) The equation of the tangent at the given point is $y - \sin^3\theta = -1.5\sin\theta\left(x - \cos^2\theta\right)$. At Q, $y = 0$ and so $x = \frac{2}{3}\sin^2\theta + \cos^2\theta$. At R, $x = 0$ and so $y = 1.5\sin\theta\cos^2\theta + \sin^3\theta$. Altogether then, △OQR has area $0.5 \times \text{Base} \times \text{Height}$, or:

$$0.5\left(\frac{2}{3}\sin^2\theta + \cos^2\theta\right)\left(1.5\sin\theta\cos^2\theta + \sin^3\theta\right) = \frac{1}{12}\sin\theta\left(2\sin^2\theta + 3\cos^2\theta\right)\left(3\cos^2\theta + 2\sin^2\theta\right)$$
$$= \frac{1}{12}\sin\theta\left(3\cos^2\theta + 2\sin^2\theta\right)^2, \text{ as desired.}$$

(iii) Using the substitution $u = \sin t$ in the last step:

$$\int_{x=0}^{x=1} y\,dx = \int_{t=0.5\pi}^{t=0} \sin^3 t\left[2\cos t(-\sin t)\right]dt = \int_0^{0.5\pi} 2\cos t\sin^4 t\,dt = \int_0^1 2u^4\,du = 2\left[\frac{u^5}{5}\right]_0^1 = \frac{2}{5}.$$

Answer to Exercise 439 (9740 N2007/II/3). (i) $(1 + x)^n\big|_{x=0} = 1$.

$\frac{d(1 + x)^n}{dx} = n(1 + x)^{n-1}$ and $\frac{d(1 + x)^n}{dx}\Big|_{x=0} = n$.

$\frac{d^2(1 + x)^n}{dx^2} = n(n - 1)(1 + x)^{n-2}$ and $\frac{d^2(1 + x)^n}{dx^2}\Big|_{x=0} = n(n - 1)$.

$\frac{d^3(1 + x)^n}{dx^3} = n(n - 1)(n - 2)(1 + x)^{n-3}$ and $\frac{d^3(1 + x)^n}{dx^3}\Big|_{x=0} = n(n - 1)(n - 2)$.

Hence, $(1 + x)^n = 1 + nx + \frac{n(n - 1)}{2!}x^2 + \frac{n(n - 1)(n - 2)}{3!}x^3 + \dots$

(ii)
$$(4 - x)^{1.5}\left(1 + 2x^2\right)^{1.5} = 4^{1.5}(1 - 0.25x)^{1.5}\left(1 + 2x^2\right)^{1.5}$$
$$= 8\left[1 + \frac{3}{2}\cdot\frac{-x}{4} + \frac{\frac{3}{2}\cdot\frac{1}{2}\cdot\left(\frac{-x}{4}\right)^2}{2!} + \frac{\frac{3}{2}\cdot\frac{1}{2}\cdot\left(-\frac{1}{2}\right)\cdot\left(\frac{-x}{4}\right)^3}{3!} + \dots\right]\left[1 + \frac{3}{2}\left(2x^2\right) + \dots\right]$$
$$= 8\left(1 - \frac{3}{8}x + \frac{3}{2^7}x^2 + \frac{1}{2^{10}}x^3 + \dots\right)\left(1 + 3x^2 + \dots\right) = \left(8 - 3x + \frac{3}{2^4}x^2 + \frac{1}{2^7}x^3 + \dots\right)\left(1 + 3x^2 + \dots\right)$$
$$= 8 - 3x + \frac{3}{2^4}x^2 + \frac{1}{2^7}x^3 + 24x^2 - 9x^3 + \dots = 8 - 3x + 24\tfrac{3}{16}x^2 - 8\tfrac{127}{128}x^3 + \dots$$

(iii) The binomial series expansions in (ii) are valid provided "$|-0.25x| < 1$ AND $|2x^2| < 1$" $\iff$ "$x \in (-4, 4)$ AND $x \in \left(-\sqrt{0.5}, \sqrt{0.5}\right)$" $\iff$ "$x \in \left(-\sqrt{0.5}, \sqrt{0.5}\right)$".
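The fraction arithmetic in Exercise 439(ii) is easy to get wrong by hand, so here is a sketch that recomputes the coefficients up to $x^3$ exactly using Python's `fractions` module. The helper functions are written for this example only. The script relies on $4^{1.5} = 8$ and on the generalised binomial coefficients from part (i).

```python
from fractions import Fraction

def binom_series(n, terms):
    """Coefficients of 1, u, u^2, ... in (1 + u)^n, as in part (i)."""
    coeffs = [Fraction(1)]
    for k in range(1, terms):
        coeffs.append(coeffs[-1] * (n - (k - 1)) / k)
    return coeffs

def poly_mul(p, q, terms):
    """Product of two polynomials (coefficient lists), truncated to `terms` terms."""
    out = [Fraction(0)] * terms
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j < terms:
                out[i + j] += a * b
    return out

n = Fraction(3, 2)
terms = 4                      # keep powers x^0 .. x^3
c = binom_series(n, terms)

# (1 - x/4)^{3/2}: substitute u = -x/4.
first = [c[k] * Fraction(-1, 4) ** k for k in range(terms)]

# (1 + 2x^2)^{3/2}: substitute u = 2x^2, so only even powers of x appear.
second = [Fraction(0)] * terms
for k in range(terms):
    if 2 * k < terms:
        second[2 * k] = c[k] * 2 ** k

product = [8 * a for a in poly_mul(first, second, terms)]
print(product)
```

The printed coefficients should be $8$, $-3$, $\frac{387}{16} = 24\tfrac{3}{16}$ and $-\frac{1151}{128} = -8\tfrac{127}{128}$.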

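Answer to Exercise 440 (9740 N2007/II/4). (i)

$$\int_0^{5\pi/3} \sin^2 x\,dx = \int_0^{5\pi/3} \frac{1 - \cos 2x}{2}\,dx = 0.5\left[x - \frac{\sin 2x}{2}\right]_0^{5\pi/3} = 0.5\left(\frac{5\pi}{3} + \frac{\sqrt{3}}{4}\right) = \frac{5\pi}{6} + \frac{\sqrt{3}}{8}.$$

$$\int_0^{5\pi/3} \cos^2 x\,dx = \int_0^{5\pi/3} 1 - \sin^2 x\,dx = \big[x\big]_0^{5\pi/3} - \left(\frac{5\pi}{6} + \frac{\sqrt{3}}{8}\right) = \frac{5\pi}{6} - \frac{\sqrt{3}}{8}.$$

(ii)(a) Integrate by parts twice, first with $u = x^2$, $v' = \sin x$ and then with $u = x$, $v' = \cos x$:

$$\int_0^{0.5\pi} x^2\sin x\,dx = \left[x^2(-\cos x)\right]_0^{0.5\pi} - \int_0^{0.5\pi} 2x(-\cos x)\,dx = 2\int_0^{0.5\pi} x\cos x\,dx$$
$$= 2\left(\big[x\sin x\big]_0^{0.5\pi} - \int_0^{0.5\pi} \sin x\,dx\right) = \pi + 2\big[\cos x\big]_0^{0.5\pi} = \pi - 2.$$

(ii)(b) $\pi\int_0^{0.5\pi} \left(x^2\sin x\right)^2 dx \approx 5.391307769139469$ (calculator).

By the way, with a lot of work, it is actually possible to show that the exact value is $\pi^2\left(\pi^4 + 20\pi^2 - 120\right)/320$.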
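The closed form quoted for Exercise 440(ii)(b) can be compared against a direct numerical integration. The sketch below uses Simpson's rule (my choice of method) to evaluate both (ii)(a) and (ii)(b).

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n subintervals (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

half_pi = math.pi / 2

# (ii)(a): integral of x^2 sin x on [0, pi/2]; expected value pi - 2.
part_a = simpson(lambda x: x * x * math.sin(x), 0.0, half_pi)

# (ii)(b): pi times the integral of (x^2 sin x)^2 on [0, pi/2].
part_b = math.pi * simpson(lambda x: (x * x * math.sin(x)) ** 2, 0.0, half_pi)

closed_form = math.pi**2 * (math.pi**4 + 20 * math.pi**2 - 120) / 320
print(part_a, math.pi - 2)
print(part_b, closed_form)
```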

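Answer to Exercise 441 (9233 N2007/I/2). The first negative coefficient is

$$4^{2.5}\cdot\frac{\left(\frac{5}{2}\right)\left(\frac{3}{2}\right)\left(\frac{1}{2}\right)\left(-\frac{1}{2}\right)}{4!}\left(\frac{3}{4}\right)^4 = \dots = -\frac{405}{1024}.$$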

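Answer to Exercise 442 (9233 N2007/I/3).

$$\pi\int_{0.5}^{0.5\sqrt{3}} y^2\,dx = \pi\int_{0.5}^{0.5\sqrt{3}} \frac{1}{1 + 4x^2}\,dx = \pi\left[\frac{\tan^{-1}(2x)}{2}\right]_{0.5}^{0.5\sqrt{3}} = \frac{\pi}{2}\left(\frac{\pi}{3} - \frac{\pi}{4}\right) = \frac{\pi^2}{24}.$$

Answer to Exercise 443 (9233 N2007/I/8). Using the substitution $t = \sin u$:

$$\int \frac{\left(\sin^{-1}t\right)\cos\left[\left(\sin^{-1}t\right)^2\right]}{\sqrt{1 - t^2}}\,dt = \int \frac{u\cos u^2}{\sqrt{1 - \sin^2 u}}\cos u\,du = \int u\cos u^2\,du.$$

$$\int_0^1 \frac{\left(\sin^{-1}t\right)\cos\left[\left(\sin^{-1}t\right)^2\right]}{\sqrt{1 - t^2}}\,dt = \int_0^{\pi/2} u\cos u^2\,du = \left[\frac{\sin u^2}{2}\right]_0^{\pi/2} = 0.5\sin\frac{\pi^2}{4} \approx 0.312.$$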

Answer to Exercise 444 (9233 N2007/I/10). (i) y = cos x and y = sin x intersect π 5π at x = and x = . Hence, with the aid of a sketch, we see that cos x > sin x ⇐⇒ 4 4 π 5π x ∈ [0, ) ∪ ( , 2π]. 4 4 (ii)

∫0

=∫

0

2π

∣cos x − sin x∣ dx

π/4

cos x − sin x dx + ∫

5π/4 π/4

sin x − cos x dx + ∫

2π 5π/4

cos x − sin x dx

= [sin x + cos x]0 + [− cos x − sin x]π/4 + [sin x + cos x]5π/4 √ √ √ 2 2 5π 5π π π = + − 1 − 2 (sin + cos ) + (sin + cos ) + 1 = 4 2. 2 2 4 4 4 4 π/4

5π/4

2π

Answer to Exercise 445 (9233 N2007/I/11).

$$\frac{5x + 4}{(x - 5)(x^2 + 4)} = \frac{A}{x - 5} + \frac{Bx + C}{x^2 + 4} = \frac{Ax^2 + 4A + Bx^2 - 5Bx + Cx - 5C}{(x - 5)(x^2 + 4)} = \frac{(A + B)x^2 + (-5B + C)x + 4A - 5C}{(x - 5)(x^2 + 4)}.$$

Comparing coefficients, $A + B = 0$, $-5B + C = 5$, $4A - 5C = 4$. So $C = 0$, $B = -1$, and $A = 1$. So

$$\int_1^4 \frac{5x + 4}{(x - 5)(x^2 + 4)} \, dx = \int_1^4 \left(\frac{1}{x - 5} - \frac{x}{x^2 + 4}\right) dx = \left[\ln|x - 5| - 0.5 \ln(x^2 + 4)\right]_1^4 = -\ln 4 - 0.5 \ln\frac{20}{5} = -1.5 \ln 4 = -\ln 8.$$


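Answer to Exercise 446 (9233 N2007/I/13). (i) $\frac{dy}{dx} = \frac{\sec x \tan x}{\sec x} = \tan x$, $\frac{d^2y}{dx^2} = \sec^2 x$, $\frac{d^3y}{dx^3} = 2\sec^2 x \tan x$. And so indeed $\frac{d^3y}{dx^3} = 2\frac{d^2y}{dx^2}\frac{dy}{dx}$, as desired.

(ii) $\frac{d^4y}{dx^4} = 2\left(2\sec^2 x \tan^2 x + \sec^4 x\right)$, which equals 2 when evaluated at $x = 0$.

(iii) $\ln(\sec x) = 0 + 0x + \frac{1}{2!}x^2 + \frac{0}{3!}x^3 + \frac{2}{4!}x^4 + \dots = \frac{x^2}{2} + \frac{x^4}{12} + \dots$

(iv) $\ln\left(\sec\frac{\pi}{4}\right) = \ln\left(2^{0.5}\right) = 0.5 \ln 2$.

But also from (iii), $\ln\left(\sec\frac{\pi}{4}\right) = \left(\frac{\pi}{4}\right)^2/2 + \left(\frac{\pi}{4}\right)^4/12 + \dots$

So $\ln 2 = \left(\frac{\pi}{4}\right)^2 + \left(\frac{\pi}{4}\right)^4/6 + \dots = \frac{\pi^2}{16} + \frac{\pi^4}{1536} + \dots$ Hence indeed $\ln 2 \approx \frac{\pi^2}{16} + \frac{\pi^4}{1536}$.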


Answer to Exercise 447 (9233 N2007/I/14). (i) Apply $\frac{d}{dx}$ to the given equation: $2x - 2y\frac{dy}{dx} = A$.

Provided $x \neq 0$, from the given equation, we have $A = \frac{x^2 - y^2}{x}$. So

$$2x - 2y\frac{dy}{dx} = \frac{x^2 - y^2}{x} \implies \frac{dy}{dx} = \frac{x^2 + y^2}{2xy}.$$

Strictly speaking, we need to also check the case where x = 0, but I doubt this was expected on the exams, so I’ll just put this bit in a footnote.100

(ii) $\frac{dy}{dx} = v + x\frac{dv}{dx}$. So $v + x\frac{dv}{dx} = -\frac{2vx^2}{x^2 + v^2x^2} = -\frac{2v}{1 + v^2} \implies x\frac{dv}{dx} = -\frac{3v + v^3}{1 + v^2}$.

(iii)

$$-\int \frac{1 + v^2}{3v + v^3} \, dv = \int \frac{1}{x} \, dx \implies -\frac{1}{3}\ln\left|3v + v^3\right| + A = \ln|x| \implies B\left|3v + v^3\right|^{-1/3} = |x|$$

$$\iff B^3 = |x|^3\left|3v + v^3\right| = \left|3x^3v + x^3v^3\right| = \left|3x^2y + y^3\right|.$$

So indeed, $3x^2y + y^3 = C$ (where $C = \pm B^3$ is some constant of integration).

100 If $x = 0$, then $y = 0$ and so from $2x - 2y\frac{dy}{dx} = A$, we see that $\frac{dy}{dx}$ is undefined, except in the special case where $A = 0$. If $A = 0$, then the curve is simply $x^2 - y^2 = 0$ or $y = \pm x$, which is a big cross on the cartesian plane and $\frac{dy}{dx} = \pm 1$. Which can also be written $\frac{dy}{dx} = (x^2 + y^2)/(2xy)$.


Answer to Exercise 448 (9233 N2006/I/7). The volume of a cone is $V = \frac{\pi r^2 h}{3}$. In this case, $r = h$ (because of the given semi-vertical angle 45°). So in this case, $V = \frac{\pi h^3}{3}$. Equivalently, $h = \left(\frac{3V}{\pi}\right)^{1/3}$.

We are given that $\frac{dV}{dt} = -2$ (1) (i.e. the volume of liquid in the cone is decreasing by 2 cm³ per second). But we also have $\frac{dV}{dt} = \pi h^2 \frac{dh}{dt}$ (2). Together, (1) and (2) imply $\frac{dh}{dt} = -\frac{2}{\pi h^2}$.

After 3 minutes (180 seconds), the remaining liquid has volume V = 390 − 180 × 2 = 30 cm³. So at this point, $h = \left(\frac{90}{\pi}\right)^{1/3}$. And $\frac{dh}{dt} = -\frac{2}{\pi^{1/3} 90^{2/3}}$. The instantaneous rate of decrease of the depth of the liquid at 3 minutes is $2\pi^{-1/3} 90^{-2/3}$ cm per second.

Answer to Exercise 449 (9233 N2006/I/8). The tangent is parallel to the x-axis where $\frac{dy}{dx} = 0$. Apply $\frac{d}{dx}$ to the equation of the curve to get: $6x + y + x\frac{dy}{dx} + 2y\frac{dy}{dx} = 0$ or $\frac{dy}{dx}(x + 2y) = -6x - y$.

So $\frac{dy}{dx} = 0 \iff -6x - y = 0$ or $y = -6x$ (1). Now plug (1) into the equation of the curve to get $3x^2 + x(-6x) + (-6x)^2 = 33$ or $33x^2 = 33$ or $x = \pm 1$. So the two desired points are $(\pm 1, \mp 6)$.

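Answer to Exercise 450 (9233 N2006/I/9). (i)

$$\frac{d}{d\theta}\sec\theta = -(\cos\theta)^{-2}(-\sin\theta) = \sin\theta(\cos\theta)^{-2} = \tan\theta(\cos\theta)^{-1} = \sec\theta\tan\theta.$$

(ii) Observe that, with $x + 1 = \sec\theta$, we have $dx = \sec\theta\tan\theta \, d\theta$ and $x^2 + 2x = \sec^2\theta - 1$.

$$\int_{\sqrt{2}-1}^{1} \frac{1}{(x + 1)\sqrt{x^2 + 2x}} \, dx = \int_{\pi/4}^{\pi/3} \frac{1}{\sec\theta\sqrt{\sec^2\theta - 1}} \sec\theta\tan\theta \, d\theta = \int_{\pi/4}^{\pi/3} \frac{1}{\tan\theta}\tan\theta \, d\theta = \frac{\pi}{3} - \frac{\pi}{4} = \frac{\pi}{12}.$$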


Answer to Exercise 451 (9233 N2006/I/12).

(i)

$$\frac{1 + x - 2x^2}{(2 - x)(1 + x^2)} = \frac{A}{2 - x} + \frac{Bx + C}{1 + x^2} = \frac{A + 2C + (2B - C)x + (A - B)x^2}{(2 - x)(1 + x^2)}.$$

Comparing coefficients, $A + 2C = 1$ (1), $2B - C = 1$ (2), and $A - B = -2$ (3). Take (1) plus 2 × (2) plus 4 × (3) to get $5A = -5$ or $A = -1$. $C = 1$ and $B = 1$. So

$$\frac{1 + x - 2x^2}{(2 - x)(1 + x^2)} = -\frac{1}{2 - x} + \frac{x + 1}{1 + x^2}.$$

(ii)

$$-\frac{1}{2 - x} + \frac{x + 1}{1 + x^2} = -\frac{1}{2}(1 - 0.5x)^{-1} + (x + 1)(1 + x^2)^{-1}$$

$$= -\frac{1}{2}\left[1 + (-1)(-0.5x) + (-1)(-2)(-0.5x)^2/2! + \dots\right] + (x + 1)\left[1 + (-1)x^2 + \dots\right]$$

$$= -\frac{1}{2}\left(1 + 0.5x + 0.25x^2 + \dots\right) + (x + 1)\left(1 - x^2 + \dots\right)$$

$$= \left(-0.5 - 0.25x - x^2/8 + \dots\right) + \left(x + 1 - x^2 + \dots\right) = 0.5 + 0.75x - 9x^2/8 + \dots$$

(iii) The binomial series expansions in (ii) are valid if “$|-0.5x| < 1$ and $|x^2| < 1$” $\iff$ “$x \in (-1, 1)$”.


Answer to Exercise 452 (9233 N2006/I/14). (i) The gradient of QR is

$$\left(\frac{c}{r} - \frac{c}{q}\right) \div (cr - cq) = \left(\frac{1}{r} - \frac{1}{q}\right) \div (r - q) = \frac{q - r}{qr(r - q)} = -\frac{1}{qr}.$$

(ii) The line through P perpendicular to QR has gradient $qr$. So it has equation $y - \frac{c}{p} = qr(x - cp)$.

Since it passes through V, we have

$$\frac{c}{v} - \frac{c}{p} = qr(cv - cp) \iff \frac{1}{v} - \frac{1}{p} = qr(v - p) \iff \frac{p - v}{vp} = qr(v - p) \iff \frac{1}{vp} = -qr \iff v = -\frac{1}{pqr}.$$

(iii) Observe that $xy = c^2$. Applying the $\frac{d}{dx}$ operator, we have $y + x\frac{dy}{dx} = 0$. So at P, $\frac{c}{p} + cp\frac{dy}{dx} = 0$ or $\frac{dy}{dx} = -\frac{c}{p} \div (cp) = -\frac{1}{p^2}$. So the gradient of the normal at P is $p^2$.

(iv) Let $cs - cp = k$ (1). Then $\frac{c}{s} - \frac{c}{p} = kp^2$ (2). Plug (1) into (2) to get $\frac{c}{s} - \frac{c}{p} = kp^2 = (cs - cp)p^2$ or $\frac{1}{s} - \frac{1}{p} = (s - p)p^2$ or $\frac{p - s}{sp} = (s - p)p^2$ or $\frac{1}{sp} = -p^2$ or $s = -\frac{1}{p^3}$, as desired.

(v) QP has gradient $-\frac{1}{qp}$ and PR has gradient $-\frac{1}{pr}$. Since these two lines are perpendicular, we must have $-\frac{1}{qp} = \frac{-1}{-1/(pr)} = pr \iff -\frac{1}{qr} = p^2$. But $-\frac{1}{qr}$ is the gradient of QR and $p^2$ is the gradient of the normal at P. So QR is parallel to the normal at P.

Answer to Exercise 453 (9233 N2006/II/2).

(i)

$$\frac{dz}{dx} = (x^2 + 32)^{-0.5} + x(-0.5)(x^2 + 32)^{-1.5}(2x) = (x^2 + 32)^{-1.5}\left(x^2 + 32 - x^2\right) = 32(x^2 + 32)^{-1.5}.$$

(ii)

$$\int_2^7 (x^2 + 32)^{-1.5} \, dx = \frac{1}{32}\left[x(x^2 + 32)^{-0.5}\right]_2^7 = \frac{1}{32}\left[7(81)^{-0.5} - 2(36)^{-0.5}\right] = \frac{1}{32}\left(\frac{7}{9} - \frac{1}{3}\right) = \frac{1}{72}.$$

92.6 Answers for Ch. 79: Probability and Statistics

Answer to Exercise 454 (9740 N2015/II/5). (i) The manager may not have all the required information to properly implement stratified sampling. For example, he may not know what proportion of the sampling population each age group composes. (ii) Decide what the age groups are and how many he wishes to survey from each group. (That is, for each age group, set a quota of respondents to be surveyed.) Then simply go around surveying customers he sees in the supermarket, until he meets the quota for each age group. (iii) The manager may unconsciously gravitate towards customers that look more friendly. He may thus not get a representative sample of his customers (many of whom look unfriendly).

Answer to Exercise 455 (9740 N2015/II/6). (i) Let X be the number of red sweets in the packet.

$$P(X \geq 4) = 1 - P(X < 4) = 1 - P(X = 0) - P(X = 1) - P(X = 2) - P(X = 3)$$

$$= 1 - 0.75^{10} - \binom{10}{1} 0.75^9 \, 0.25 - \binom{10}{2} 0.75^8 \, 0.25^2 - \binom{10}{3} 0.75^7 \, 0.25^3 \approx 0.224125.$$

(ii) $X \sim B(100, 0.25)$. Since $np = 25 > 5$ and $n(1 - p) > 5$, the normal approximation $Y \sim N(25, 18.75)$ is suitable. Hence, using also the continuity correction,

$$P(X \geq 30) = 1 - P(X < 30) \approx 1 - P(Y < 29.5) = 1 - \Phi\left(\frac{29.5 - 25}{\sqrt{18.75}}\right) \approx 1 - \Phi(1.039) \approx 1 - 0.8506 = 0.1494.$$

(iii) Let $p = P(X \geq 30) \approx 0.1494$ and $q = 1 - P(X \geq 30) \approx 0.8506$. Then the desired probability is

$$\binom{15}{0} q^{15} + \binom{15}{1} p q^{14} + \binom{15}{2} p^2 q^{13} + \binom{15}{3} p^3 q^{12} \approx 0.8245.$$
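The binomial sums in (i) and (iii) can be reproduced in a few lines. This is a minimal sketch rather than the calculator procedure expected in the exam; the helper names `binom_pmf` and `binom_cdf` are illustrative, and (iii) simply reuses the rounded value 0.1494 from (ii).

```python
from math import comb

def binom_pmf(n, p, k):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(n, p, k):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(binom_pmf(n, p, i) for i in range(k + 1))

# (i) At least 4 red sweets in a packet of 10, each red with probability 0.25
p_at_least_4 = 1 - binom_cdf(10, 0.25, 3)

# (iii) At most 3 of 15 packets contain at least 30 red sweets,
# using the rounded normal-approximation value from (ii)
p_iii = binom_cdf(15, 0.1494, 3)

print(round(p_at_least_4, 6), round(p_iii, 4))
```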


Answer to Exercise 456 (9740 N2015/II/7). (i) The rate at which errors are made is independent of the number of errors that have already been made. The rate at which errors are made is constant throughout the newspaper.

(ii) Let $E \sim Po(6 \cdot 1.3) = Po(7.8)$. Then

$$P(E > 10) = 1 - P(E \leq 10) = 1 - e^{-7.8}\left(\frac{7.8^0}{0!} + \frac{7.8^1}{1!} + \dots + \frac{7.8^{10}}{10!}\right) \approx 0.164770.$$

(iii) Let $F \sim Po(1.3n)$. We are given that $P(F < 2) < 0.05$. That is,

$$e^{-1.3n}\left(\frac{(1.3n)^0}{0!} + \frac{(1.3n)^1}{1!}\right) < 0.05 \iff e^{-1.3n}(1 + 1.3n) < 0.05.$$

Let $f(n) = e^{-1.3n}(1 + 1.3n)$. From calculator, $f(1), f(2), f(3) > 0.05$ and $f(4) < 0.05$. Hence, the smallest possible integer value of n is 4.
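The calculator steps in (ii) and (iii) can be mirrored with a short script. It is only a sketch: `poisson_cdf` is an illustrative helper written from the Poisson formula, and the loop does by brute force what the table of f(n) values does above.

```python
import math

def poisson_cdf(lam, k):
    """P(X <= k) for X ~ Po(lam)."""
    return math.exp(-lam) * sum(lam**i / math.factorial(i) for i in range(k + 1))

# (ii) More than 10 errors when the mean is 6 * 1.3 = 7.8
p_more_than_10 = 1 - poisson_cdf(7.8, 10)

# (iii) Smallest integer n with P(F < 2) = P(F <= 1) < 0.05, where F ~ Po(1.3 n)
n = 1
while poisson_cdf(1.3 * n, 1) >= 0.05:
    n += 1

print(round(p_more_than_10, 6), n)
```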

Answer to Exercise 457 (9740 N2015/II/8). (i)

$$\bar{x} = \frac{\sum x_i}{n} = \frac{0.80 + 1.000 + 0.82 + 0.85 + 0.93 + 0.96 + 0.81 + 0.89}{8} = 0.8825,$$

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{(0.80 - 0.8825)^2 + (1.000 - 0.8825)^2 + \dots + (0.89 - 0.8825)^2}{7} \approx 0.005592857.$$

(ii) The null hypothesis is $H_0: \mu_0 = 0.9$ and the alternative hypothesis is $H_A: \mu_0 < 0.9$.

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{0.8825 - 0.9}{\sqrt{0.005592857}/\sqrt{8}} \approx -0.661860.$$

Since $|t| \approx 0.662 < t_{7,0.1} = 1.415$, we are unable to reject the null hypothesis at the 10% significance level.
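As a quick check of the arithmetic, the sample mean, the unbiased sample variance, and the test statistic can be recomputed from the eight data values listed in (i). This is a sketch only; the variable names are illustrative, and the critical value 1.415 is taken from the answer above rather than computed.

```python
import math

data = [0.80, 1.000, 0.82, 0.85, 0.93, 0.96, 0.81, 0.89]
n = len(data)

mean = sum(data) / n
# Unbiased sample variance (divide by n - 1)
var = sum((x - mean) ** 2 for x in data) / (n - 1)

mu0 = 0.9
t_stat = (mean - mu0) / math.sqrt(var / n)

# One-tailed test at the 10% level with 7 degrees of freedom;
# the critical value is quoted from the answer, not derived here.
reject = t_stat < -1.415

print(mean, var, t_stat, reject)
```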


Answer to Exercise 458 (9740 N2015/II/9). (i) By indep., P(B∣A) = P(B) = 0.4.

(ii)P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C) = 0.45 + 0.4 + 0.3 − 0.45 ⋅ 0.4 − 0.45 ⋅ 0.3 − P(B ∩ C) + 0.1 = 0.935 − P(B ∩ C)

Ô⇒ P(A′ ∩ B ′ ∩ C ′ ) = 1 − P(A ∪ B ∪ C) = 0.065 + P(B ∩ C).

The above is true even if B and C are not independent.

And if B and C are independent, P(B ∩ C) = 0.4 ⋅ 0.3 = 0.12 and P(A′ ∩ B ′ ∩ C ′ ) = 0.185.

(iii) We know that P(A ∩ B ′ ∩ C) = P(A ∩ C) − P(A ∩ B ∩ C) = 0.135 − 0.1 = 0.035.

We want to find lower and upper bounds for P(B ∩ C). Refer to diagram below.

At one extreme, it could be that P(A′ ∩ B ∩ C) = 0, in which case P(B ∩ C) = 0.1.

At the other extreme, it could be that P(A′ ∩ B ′ ∩ C) = 0, in which case P(B ∩ C) = 0.265.

Altogether, P(A′ ∩ B ′ ∩ C ′ ) = 0.065 + P(B ∩ C) ∈ [0.165, 0.33]

Note that P(B ∩ C) ≥ P(A ∩ B ∩ C) = 0.1, P(B ∩ C) ≤ P(B) = 0.4, and P(B ∩ C) ≤ P(C) = 0.3.


Answer to Exercise 459 (9740 N2015/II/10).

(i) [Scatter diagram of y against x; the x-axis runs from 0 to 45,000 and the y-axis from 0 to 30.]

(ii) (a) PMCC ≈ −0.9807.

(ii) (b) PMCC ≈ −0.9748. (ii) (c) PMCC ≈ −0.9986.

(iii) We are apparently supposed to presume that the greater the magnitude of the PMCC, the “better” or the “more appropriate” the model. So we are supposed to use (c) from part (ii). The estimated regression equation is $y - \bar{y} = b(x - \bar{x})$, where $b = \sum \hat{x}_i \hat{y}_i / \sum \hat{x}_i^2$. So in this case, the estimated regression equation is

$$P - 14.083 = -0.147\left(\sqrt{h} - 140.986\right).$$

(iv) Let x be the height given in metres. Then $3x = h$. Thus, the above equation may be rewritten as

$$P - 14.083 = -0.147\left(\sqrt{3x} - 140.986\right).$$


Answer to Exercise 460 (9740 N2015/II/11).

(i) 8!/ (2!2!) = 10080.

(ii) There is only one arrangement where the letters are in alphabetical order, namely AABBCEGS. Hence, the number of these arrangements in which the letters are not in alphabetical order is 10080 − 1 = 10079.

(iii) Treating the two A’s as a single unit and the two B’s as a single unit, we have 6 units altogether, so there are 6! arrangements.

(iv) Treating the two A’s as a single unit, we have 7 units altogether, so there are 7!/2! arrangements. Treating the two B’s as a single unit, we have 7 units altogether, so there are 7!/2! arrangements. Hence, the number of arrangements with at least one pair of adjacent identical letters is 7!/2! + 7!/2! − 6! = 7! − 6!, where the subtraction of 6! is to avoid double counting.

Hence, the number of different arrangements with no two adjacent letters the same is 8!/(2!2!) − (7! − 6!) = 5760.

Answer to Exercise 461 (9740 N2015/II/12). (i) Let $A_1, A_2, A_3, A_4, A_5$ be independent random variables with the identical distribution $N(300, 20^2)$. Then $F = A_1 + A_2 + A_3 + A_4 + A_5 \sim N(5 \cdot 300, 5 \cdot 20^2)$ and

$$P(F > 1600) = 1 - P(F \leq 1600) = 1 - \Phi\left(\frac{1600 - 1500}{\sqrt{5 \cdot 20^2}}\right) = 1 - \Phi\left(\sqrt{5}\right) \approx 1 - \Phi(2.236) \approx 1 - 0.9873 = 0.0127.$$

(ii) Let $P \sim N(200, 15^2)$. Then $E = P_1 + P_2 + \dots + P_8 \sim N(8 \cdot 200, 8 \cdot 15^2)$. Then $F - E \sim N(5 \cdot 300 - 8 \cdot 200, 5 \cdot 20^2 + 8 \cdot 15^2) = N(-100, 3800)$ and

$$P(F > E) = P(F - E > 0) = 1 - P(F - E \leq 0) = 1 - \Phi\left(\frac{0 - (-100)}{\sqrt{3800}}\right) = 1 - \Phi\left(\frac{10}{\sqrt{38}}\right) \approx 1 - \Phi(1.622) \approx 1 - 0.9476 = 0.0524.$$

(iii) $0.85F + 0.9E \sim N(0.85 \cdot 5 \cdot 300 + 0.9 \cdot 8 \cdot 200, \; 0.85^2 \cdot 5 \cdot 20^2 + 0.9^2 \cdot 8 \cdot 15^2) = N(2715, 2903)$.

$$P(0.85F + 0.9E < 2750) = \Phi\left(\frac{2750 - 2715}{\sqrt{2903}}\right) \approx \Phi(0.650) \approx 0.7422.$$
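The three normal probabilities in Exercise 461 can be checked without tables by writing Φ in terms of the error function. This is a sketch under the same independence assumptions as the answer; `phi` is an illustrative helper, and small differences from the table-based values are expected because the answer rounds z to three decimal places.

```python
import math

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# F: sum of 5 independent N(300, 20^2); E: sum of 8 independent N(200, 15^2)
mean_F, var_F = 5 * 300, 5 * 20**2
mean_E, var_E = 8 * 200, 8 * 15**2

p_i = 1 - phi((1600 - mean_F) / math.sqrt(var_F))                  # P(F > 1600)
p_ii = 1 - phi((0 - (mean_F - mean_E)) / math.sqrt(var_F + var_E))  # P(F - E > 0)

# Linear combination 0.85 F + 0.9 E
mean_T = 0.85 * mean_F + 0.9 * mean_E
var_T = 0.85**2 * var_F + 0.9**2 * var_E
p_iii = phi((2750 - mean_T) / math.sqrt(var_T))                     # P(T < 2750)

print(round(p_i, 4), round(p_ii, 4), round(p_iii, 4))
```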

Answer to Exercise 462 (9740 N2014/II/5). (i) Arrange these 10000 customers by name, alphabetically. If two customers have the exact same name, then randomly pick one to precede the other. From this list of alphabetically-sorted customers, pick every 20th customer to survey. (ii) Advantage: Each customer has equal probability of being surveyed. Disadvantage: There is the small risk that there is some periodic pattern that could bias the sample. For example, it could be that the customers are all in some country (or concentration camp), where each person has a 9-digit number for a name (e.g. 001533123) and only the most-privileged persons have 7 as the last digit of their name. If so, our proposed method would omit all the most-privileged persons. Such a pattern is obviously contrived and absurdly unlikely. In practice, it is unlikely that my proposed method of systematic sampling is any different from purely random sampling.

Answer to Exercise 463 (9740 N2014/II/6).

(i) $\binom{3}{1}\binom{8}{4}\binom{5}{2}\binom{6}{4} = 31500$.

(ii) Ways to include only the midfielder brother $= \binom{3}{1}\binom{8}{4}\binom{4}{1}\binom{5}{4} = 4200$.

Ways to include only the attacker brother $= \binom{3}{1}\binom{8}{4}\binom{4}{2}\binom{5}{3} = 12600$.

In total, 16800 ways.

(iii) The club now has 3 goalkeepers, 8 defenders, 3 midfielders, 5 attackers, and one player (call him Apu) who can either be a midfielder or a defender.

Ways to form a team without Apu $= \binom{3}{1}\binom{8}{4}\binom{3}{2}\binom{5}{4} = 3150$.

Ways to form a team with Apu as a midfielder $= \binom{3}{1}\binom{8}{4}\binom{3}{1}\binom{5}{4} = 3150$.

Ways to form a team with Apu as a defender $= \binom{3}{1}\binom{8}{3}\binom{3}{2}\binom{5}{4} = 2520$.

In total, 8820 ways.
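The products of binomial coefficients in Exercise 463 are easy to verify with `math.comb`. The sketch below just restates each product from the answer; the variable names are illustrative.

```python
from math import comb

# (i) 1 of 3 goalkeepers, 4 of 8 defenders, 2 of 5 midfielders, 4 of 6 attackers
part_i = comb(3, 1) * comb(8, 4) * comb(5, 2) * comb(6, 4)

# (ii) Exactly one of the two brothers is included
only_midfielder_brother = comb(3, 1) * comb(8, 4) * comb(4, 1) * comb(5, 4)
only_attacker_brother = comb(3, 1) * comb(8, 4) * comb(4, 2) * comb(5, 3)
part_ii = only_midfielder_brother + only_attacker_brother

# (iii) Apu is left out, plays as a midfielder, or plays as a defender
without_apu = comb(3, 1) * comb(8, 4) * comb(3, 2) * comb(5, 4)
apu_midfielder = comb(3, 1) * comb(8, 4) * comb(3, 1) * comb(5, 4)
apu_defender = comb(3, 1) * comb(8, 3) * comb(3, 2) * comb(5, 4)
part_iii = without_apu + apu_midfielder + apu_defender

print(part_i, part_ii, part_iii)
```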

Answer to Exercise 464 (9740 N2014/II/7). (i) Let X be the number of sixes rolled. Then $X \sim B\left(10, \frac{1}{6}\right)$. And

$$P(X = 3) = \binom{10}{3}\left(\frac{1}{6}\right)^3\left(\frac{5}{6}\right)^7 = \frac{10! \, 5^7}{3! \, 7! \, 6^{10}} \approx 0.155045.$$

(ii) Let Y be the number of sixes rolled. Then $Y \sim B\left(60, \frac{1}{6}\right)$. We have $np > 5$ and $n(1 - p) > 5$. So $Z \sim N\left(10, \frac{50}{6}\right)$ is a suitable approximate distribution for Y. Using also the continuity correction, we have

$$P(5 \leq Y \leq 8) \approx P(4.5 < Z < 8.5) = \Phi\left(\frac{8.5 - 10}{\sqrt{50/6}}\right) - \Phi\left(\frac{4.5 - 10}{\sqrt{50/6}}\right) \approx \Phi(-0.520) - \Phi(-1.905)$$

$$= \Phi(1.905) - \Phi(0.520) \approx 0.9716 - 0.6985 = 0.2731.$$

(Without using an approximation, P(5 ≤ Y ≤ 8) ≈ 0.291854.)

(iii) Let A be the number of sixes rolled. Then $A \sim B\left(60, \frac{1}{15}\right)$. We have $n > 20$ and $np < 5$. So $B \sim Po(4)$ is a suitable approximate distribution for A.

$$P(5 \leq A \leq 8) \approx P(5 \leq B \leq 8) = e^{-4}\left(\frac{4^5}{5!} + \frac{4^6}{6!} + \frac{4^7}{7!} + \frac{4^8}{8!}\right) \approx 0.349800.$$

(Without using an approximation, P(5 ≤ A ≤ 8) ≈ 0.353659.)


Answer to Exercise 465 (9740 N2014/II/8). (a) Case (i) is in red and case (ii) is in blue. [Sketch of y against x.]

(b) (i) (A) PMCC ≈ −0.9470452. (b) (i) (B) PMCC ≈ −0.974921.

(ii) It’s not at all clear which is the better model. But apparently we are supposed to say that the second model is better because the magnitude of its PMCC is greater. In general, the estimated regression equation is $y - \bar{y} = b(x - \bar{x})$, where $b = \sum \hat{x}_i \hat{y}_i / \sum \hat{x}_i^2$. So in this case, the estimated regression equation is

$$P - 72590 \approx -33659.728(\ln m - 3.657) \iff P \approx -33659.728 \ln m + 195693.560.$$

(iii) $P(50) \approx -33659.728 \ln 50 + 195693.560 \approx 64016$.


Answer to Exercise 466 (9740 N2014/II/9). (i) Let X be the number of minutes a bus is late after the new company has taken over. We’ll assume X ∼ N (µ, σ 2 ). Our null hypothesis is H0 ∶ µ = µ0 = 4.3 and our alternative hypothesis is HA ∶ µ < µ0 = 4.3.

(ii) The null hypothesis is not rejected if $\bar{t} > \mu_0 - t_{9,0.1} \cdot k/\sqrt{n} = 4.3 - 1.383 \cdot \sqrt{3.2}/\sqrt{10} \approx 3.518$.

(iii) The null hypothesis is rejected if $\bar{t} < \mu_0 - t_{9,0.1} \cdot k/\sqrt{n}$ or $4.0 < 4.3 - 1.383 \cdot k/\sqrt{10}$ or $k < 0.3\sqrt{10}/1.383$ or $k^2 < 0.3^2 \cdot 10/1.383^2 \approx 0.471$.

Answer to Exercise 467 (9740 N2014/II/10).

(i) (a) 0.1 ⋅ 0.2 ⋅ 0.1 = 0.002.

(i) (b) The probability that no ⋆ is displayed is 0.9 ⋅ 0.8 ⋅ 0.9 = 0.648. And so the probability that at least one ⋆ symbol is displayed is 1 − 0.648 = 0.352.

(i) (c) P(× × +) = 0.3 ⋅ 0.1 ⋅ 0.2, P(× + ×) = 0.3 ⋅ 0.3 ⋅ 0.4, P(+ × ×) = 0.4 ⋅ 0.1 ⋅ 0.4.

Thus, the desired probability is 0.006 + 0.036 + 0.016 = 0.058.

(ii) Writing ⋆′ for “not ⋆”, the probability that there is exactly one ⋆ is P(⋆ ⋆′ ⋆′) + P(⋆′ ⋆ ⋆′) + P(⋆′ ⋆′ ⋆) = 0.1 ⋅ 0.8 ⋅ 0.9 + 0.9 ⋅ 0.2 ⋅ 0.9 + 0.9 ⋅ 0.8 ⋅ 0.1 = 0.306. The probability that the symbols are ⋆, +, ◯ (in any order) is

P (⋆ + ◯) + P (⋆◯+) + P (+ ⋆ ◯) + P (◯ ⋆ +) + P (+◯⋆) + P (◯ + ⋆) = 0.1(0.3 ⋅ 0.3 + 0.4 ⋅ 0.2) + 0.2(0.4 ⋅ 0.3 + 0.2 ⋅ 0.2) + 0.1(0.4 ⋅ 0.4 + 0.2 ⋅ 0.3) = 0.017 + 0.032 + 0.022 = 0.071

Hence, the desired probability is 0.071/0.306 = 71/306 ≈ 0.232026.


Answer to Exercise 468 (9740 N2014/II/11). (i) (a) Let $O \sim Po(2)$ and $P \sim Po(11)$. Then

$$P(P > 8) = 1 - P(P \leq 8) = 1 - e^{-11}\left(\frac{11^0}{0!} + \frac{11^1}{1!} + \dots + \frac{11^8}{8!}\right) \approx 0.768015.$$

(i) (b) $O + P \sim Po(13)$. So

$$P(O + P < 15) = e^{-13}\left(\frac{13^0}{0!} + \frac{13^1}{1!} + \dots + \frac{13^{14}}{14!}\right) \approx 0.675132.$$

(ii) Let $Q \sim Po(2n)$. We are given that $P(Q < 3) < 0.01$. That is,

$$P(Q < 3) = e^{-2n}\left(\frac{(2n)^0}{0!} + \frac{(2n)^1}{1!} + \frac{(2n)^2}{2!}\right) = e^{-2n}\left(1 + 2n + 2n^2\right) < 0.01.$$

Let $f(n) = e^{-2n}(1 + 2n + 2n^2)$. From calculator, $f(1), f(2), f(3), f(4) > 0.01$ and $f(5) < 0.01$. Hence, the smallest possible integer value of n is 5.

(iii) Let $R \sim Po(52 \cdot 11) = Po(572)$. Given a large sample, we can use the normal distribution $S \sim N(572, 572)$ as an approximation. Hence, using also the continuity correction,

$$P(R > 550) \approx P(S > 550.5) = 1 - P(S < 550.5) = 1 - \Phi\left(\frac{550.5 - 572}{\sqrt{572}}\right) \approx 1 - \Phi(-0.898960) = \Phi(0.898960) \approx 0.8158.$$

(iv) Sales may be seasonal — e.g. it may be that art collectors make most of their purchases in the northern hemisphere’s summer months. The sales of originals and prints may not be independent of each other. E.g., an art collector who buys an original Picasso might wish to also buy a few copies thereof.


Answer to Exercise 469 (9740 N2013/II/5). (i) Use a computer program to randomly sort the 100000 employees into an ordered list. Pick the first 90 employees on the list. The Chief Executive’s idea of a representative sample might be to have each country’s employees proportionally represented. For example, if 10% of employees are from India, then she may want 9 of the invited employees to be from India.

(ii) Stratified sampling is more appropriate. If say 10% of employees are from India, 30% from China, 20% from Thailand, and 40% from Singapore, then we could instead pick from the list the first 9 Indian employees, the first 27 Chinese employees, the first 18 Thai employees, and the first 36 Singaporean employees.

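Answer to Exercise 470 (9740 N2013/II/6).

$$P(Y < 2a) = P\left(Z < \frac{2a - \mu}{\sigma}\right) = 0.95 \implies \frac{2a - \mu}{\sigma} \approx 1.645 \iff \frac{2a - \mu}{1.645} \approx \sigma. \quad (1)$$

$$P(Y < a) = P\left(Z < \frac{a - \mu}{\sigma}\right) = 0.25 \implies \frac{a - \mu}{\sigma} \approx -0.674 \iff \mu - a \approx 0.674\sigma \approx 0.674 \cdot \frac{2a - \mu}{1.645} \quad \text{(using (1))}$$

$$\iff \mu\left(1 + \frac{0.674}{1.645}\right) \approx \left(1 + \frac{2 \cdot 0.674}{1.645}\right)a \iff \mu \approx 1.29a.$$

That is, $k \approx 1.29$.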

Answer to Exercise 471 (9740 N2013/II/7). (i) The probability that one packet contains a free gift is independent of whether another packet contains a free gift. There is no possibility that any one packet contains two or more free gifts.

(ii) Let $F \sim B\left(20, \frac{1}{20}\right)$. Then $P(F = 1) = \binom{20}{1}\left(\frac{1}{20}\right)^1\left(\frac{19}{20}\right)^{19} \approx 0.377354$.

(iii) Let $F \sim B\left(60, \frac{1}{20}\right)$. Since $n = 60$ is large and $np = 3$ is small, a suitable approximation for F is $G \sim Po(3)$.

$$P(F \geq 5) \approx P(G \geq 5) = 1 - e^{-3}\left(\frac{3^0}{0!} + \frac{3^1}{1!} + \frac{3^2}{2!} + \frac{3^3}{3!} + \frac{3^4}{4!}\right) \approx 0.184737.$$

(By comparison, the actual probability is P(F ≥ 5) ≈ 0.180335.)

Answer to Exercise 472 (9740 N2013/II/8). (i) P(B ∩ A′) = P(B∣A′)P(A′) = 0.8 × 0.3 = 0.24.

(ii) P(A′ ∩ B′) = 1 − [P(A) + P(B ∩ A′)] = 1 − 0.7 − 0.24 = 0.06.

(iii) P(A′∣B′) = 1 − P(A∣B′) = 0.12.

P(B′) = P(A′ ∩ B′)/P(A′∣B′) = 0.06/0.12 = 0.5.

P(A ∩ B) = 1 − [P(A′) + P(A ∩ B′)] = 1 − 0.3 − P(A ∩ B′) = 0.7 − P(A∣B′)P(B′) = 0.7 − 0.88 × 0.5 = 0.26.

Answer to Exercise 473 (9740 N2013/II/9).

(i)

$$\bar{x} = \frac{\sum x_i}{n} = \frac{14.0 + 12.5 + 11.0 + 11.0 + 12.5 + 12.6 + 15.6 + 13.2}{8} = 12.8,$$

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{(14.0 - 12.8)^2 + (12.5 - 12.8)^2 + \dots + (13.2 - 12.8)^2}{7} \approx 2.305714.$$

(ii) The necessary assumption is that the population is normally distributed. The null hypothesis is $H_0: \mu_0 = 13.8$ and the alternative hypothesis is $H_A: \mu_0 < 13.8$.

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{12.8 - 13.8}{\sqrt{2.305714/8}} \approx -1.862697.$$

Since $|t| < t_{7,0.05} = 1.895$, we are unable to reject the null hypothesis at the 5% significance level.


Answer to Exercise 474 (9740 N2013/II/10). (i) In blue is case (A), in red is case (B), and in green is case (C). [Sketch of y against x.]

(ii) [Scatter diagram of distance y (axis from 0 to 150) against speed x (axis from 0 to 150).]

(iii) As a function of speed, the distance travelled decreases at an increasing rate. So (A) is the most appropriate. PMCC ≈ −0.939203.

(iv) In general, the estimated regression equation is $y - \bar{y} = b(x - \bar{x})$, where $b = \sum \hat{x}_i \hat{y}_i / \sum \hat{x}_i^2$. So in this case, the estimated regression equation is

$$y - 135 \approx -0.00461978\left(x^2 - 11850.66667\right) \iff y \approx -0.00461978x^2 + 189.747528.$$

Thus, $y(110) \approx -0.00461978(110)^2 + 189.747528 \approx 134$.

Answer to Exercise 475 (9740 N2013/II/11). (i) The total number of ways to choose a code is $26^3 \cdot 9^2$ (1423656). The number of ways to choose a code with three different letters and two different digits is 26 ⋅ 25 ⋅ 24 ⋅ 9 ⋅ 8 (1123200). Hence, the desired probability is $26 \cdot 25 \cdot 24 \cdot 9 \cdot 8/(26^3 \cdot 9^2) = \frac{400}{507} \approx 0.78895$.

(ii) The number of ways to choose the two digits so that the second digit is larger than the first is 1 + 2 + ⋅⋅⋅ + 8 = 36. Hence, the desired probability is $(1 + 2 + \dots + 8)/9^2 = \frac{4}{9} = 0.\dot{4}$.

(iii) The number of ways to choose a code with exactly two letters the same, but not two digits the same is

$$\underbrace{26}_{\text{Repeated letter}} \times \underbrace{25}_{\text{Third letter}} \times \underbrace{\frac{3!}{2!}}_{\text{Arrange these three letters}} \times \underbrace{9 \cdot 8}_{\text{Two digits}} = 26 \cdot 25 \cdot 3 \cdot 9 \cdot 8 = 140400.$$

The number of ways to choose a code with exactly two digits the same, but not exactly two letters the same is

$$\underbrace{9}_{\text{Repeated digit}} \times \Big(\underbrace{26 \cdot 25 \cdot 24}_{\text{All three letters different}} + \underbrace{26}_{\text{All three letters same}}\Big) = 9 \cdot 26 \cdot 601 = 140634.$$

Hence the desired probability is

$$\frac{26 \cdot 25 \cdot 3 \cdot 9 \cdot 8 + 9 \cdot 26 \cdot 601}{26^3 \cdot 9^2} = \frac{25 \cdot 3 \cdot 8 + 601}{26^2 \cdot 9} = \frac{1201}{6084} \approx 0.197.$$

(iv) There are 4 ways to choose the even digit, 5 to choose the odd digit then 2 ways to arrange these two digits. Hence, there are 4 ⋅ 5 ⋅ 2 = 40 ways to choose the two digits.

There are 5 ways to choose the vowel. There are $21^2$ ways to choose the two consonants. We can now slot in the vowel amidst the consonants in 3 different ways. Hence, there are $5 \cdot 21^2 \cdot 3$ ways to choose the three letters. Altogether then, there are $5 \cdot 21^2 \cdot 3 \cdot 4 \cdot 5 \cdot 2$ ways to choose a code with exactly one vowel and exactly one even digit. Hence the desired probability is

$$\frac{5 \cdot 21^2 \cdot 3 \cdot 4 \cdot 5 \cdot 2}{26^3 \cdot 9^2} = \frac{5 \cdot 7^2 \cdot 5}{13^3 \cdot 3} = \frac{1225}{6591} \approx 0.18586.$$
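The four probabilities in Exercise 475 can be cross-checked by brute-force enumeration. The sketch below assumes, consistently with the total of 26³ ⋅ 9² codes used in the answer, that a code is three letters from A to Z followed by two digits from 1 to 9, and that the vowels are A, E, I, O, U. Because letters and digits are chosen independently, it enumerates the 26³ letter triples and the 9² digit pairs separately and multiplies the counts. All names are illustrative.

```python
from fractions import Fraction
from itertools import product
from string import ascii_uppercase

letters = list(product(ascii_uppercase, repeat=3))  # 26^3 letter triples
digits = list(product(range(1, 10), repeat=2))      # 9^2 digit pairs (digits 1-9 assumed)
n_letters, n_digits = len(letters), len(digits)
total = n_letters * n_digits

def count(seq, pred):
    """Number of items in seq satisfying pred."""
    return sum(1 for s in seq if pred(s))

all_diff_letters = count(letters, lambda t: len(set(t)) == 3)
two_same_letters = count(letters, lambda t: len(set(t)) == 2)  # exactly two the same
diff_digits = count(digits, lambda d: d[0] != d[1])
same_digits = n_digits - diff_digits
increasing_digits = count(digits, lambda d: d[0] < d[1])
one_vowel = count(letters, lambda t: sum(c in "AEIOU" for c in t) == 1)
one_even = count(digits, lambda d: sum(x % 2 == 0 for x in d) == 1)

p_i = Fraction(all_diff_letters * diff_digits, total)
p_ii = Fraction(increasing_digits, n_digits)
# (iii): exactly two letters the same with different digits, or
#        two digits the same without exactly two letters the same
p_iii = Fraction(
    two_same_letters * diff_digits + (n_letters - two_same_letters) * same_digits,
    total,
)
p_iv = Fraction(one_vowel * one_even, total)

print(p_i, p_ii, p_iii, p_iv)
```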

Answer to Exercise 476 (9740 N2013/II/12). (i) #1. The number of people sick on a particular day is independent of how many were sick the previous day. #2. The average number of illnesses in any span of 30 days is the same, throughout the course of the year. Condition #1 may not be met if the illness is contagious. If so, we’d expect the number of people sick on a particular day to depend (positively) on how many were sick the previous day. Condition #2 may not be met if the illnesses are seasonal. For example, due to influenza, illnesses may be more common during the winter than during the summer. Let A ∼ Po(1.2) and M ∼ Po(2.7). T

(ii) Let $B \sim Po(1.2n)$. Then $P(B = 0) < 0.01 \iff e^{-1.2n} < 0.01 \iff n > (\ln 0.01)/(-1.2) \approx 3.8$. Hence, the smallest number of days is 4.

(iii) Let C be the total number of days of absence across both departments, over a 5-day period. Then $C \sim Po(19.5)$ and

$$P(C > 20) = 1 - P(C \leq 20) = 1 - e^{-19.5}\sum_{i=0}^{20}\frac{19.5^i}{i!} \approx 0.396583.$$

(iv) Let D be the total number of days of absence across both departments, over a 60-day period. Then $D \sim Po(234)$. Since $\lambda_D = 234$ is large, the normal distribution is a suitable approximation. Let $E \sim N(234, 234)$. Then

$$P(200 \leq D \leq 250) \approx P(199.5 \leq E \leq 250.5) = \Phi\left(\frac{250.5 - 234}{\sqrt{234}}\right) - \Phi\left(\frac{199.5 - 234}{\sqrt{234}}\right)$$

$$\approx \Phi(1.0786) - \Phi(-2.2553) \approx 0.8597 - 0.0120 = 0.8477.$$


Answer to Exercise 477 (9740 N2012/II/5). (i) (a) Let +, −, D, and N denote the events “positive result”, “negative result”, “has disease”, and “no disease”. Then P(+) = P(+∣D)P(D) + P(+∣N )P(N ) = p ⋅ 0.001 + (1 − p) ⋅ 0.999 = 0.999 − 0.998p = 0.00599. (i) (b) P(D∣+) = P(D ∩ +) ÷ P(+) = P(D)P(+∣D) ÷ P(+) = 0.001p ÷ 0.00599 ≈ 0.166110.

(ii) We want P(D∣+) = 0.75. But

$$P(D \mid +) = \frac{0.001p}{0.999 - 0.998p}.$$

So 3(0.999 − 0.998p) = 4(0.001p) or 2.997 = 2.998p or p ≈ 0.999666.

Answer to Exercise 478 (9740 N2012/II/6).

(i) H0 ∶ µ0 = 14.0, HA ∶ µ0 ≠ 14.0.

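(ii) Under $H_0$, $\bar{x} \sim N(14.0, 3.8^2/20)$. Since $Z_{0.025} = 1.96$, the values of $\bar{x}$ for which the null hypothesis would not be rejected are

$$\bar{x} \in \left(\mu - Z_{0.025}\frac{\sigma}{\sqrt{n}}, \; \mu + Z_{0.025}\frac{\sigma}{\sqrt{n}}\right) = \left(14.0 - 1.96\frac{3.8}{\sqrt{20}}, \; 14.0 + 1.96\frac{3.8}{\sqrt{20}}\right) \approx (12.335, 15.665).$$

(iii) The null hypothesis is rejected.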


Answer to Exercise 479 (9740 N2012/II/7). (i) There are 15! ways to arrange the 15 individuals.

There are 2 ways to arrange the 2 sisters as a single unit. Counting the 2 sisters as a single unit, we have 14 units total, and there are 14! ways to arrange these 14 units. So, there are in total 2 ⋅ 14! ways to arrange the 15 individuals so that the two sisters are together.

Hence, the probability that the sisters are next to each other is $2 \cdot 14!/15! = 2/15 = 0.1\dot{3}$.

(ii) There are 3! ways to arrange the 3 brothers as a single unit. Counting the 3 brothers as a single unit, we have 13 units total, and there are 13! ways to arrange these 13 units. So, there are in total 3! ⋅ 13! ways to arrange the 15 individuals so that the three brothers are together. We do not want the three brothers to be together. Hence, the desired probability is 1 − 3! ⋅ 13!/15! = 1 − 6/(14 ⋅ 15) = 1 − 1/35 = 34/35 ≈ 0.97142857.

(iii) There are 2 ways to arrange the 2 sisters as a single unit and 3! ways to arrange the 3 brothers as a single unit. Counting the 2 sisters as a single unit and also the 3 brothers as a single unit, we have 12 units in total, and there are 12! ways to arrange these 12 units. So, there are in total 2 ⋅ 3! ⋅ 12! ways to arrange the 15 individuals so that the 2 sisters are together and the 3 brothers are together. Hence, the desired probability is 2 ⋅ 3! ⋅ 12!/15! = 12/(13 ⋅ 14 ⋅ 15) = 2/(13 ⋅ 7 ⋅ 5) = 2/455 ≈ 0.0043956.

(iv) Let A and B denote the events that “the sisters are next to each other” and “the brothers are next to each other”. Our desired probability is P(A ∪ B).

$$P(A \cup B) = P(A) + P(B) - P(A \cap B) = \frac{2}{15} + \frac{1}{35} - \frac{2}{455} = \frac{91 \cdot 2}{3 \cdot 455} + \frac{13}{455} - \frac{2}{455}$$

$$= \frac{91 \cdot 2}{3 \cdot 455} + \frac{33}{3 \cdot 455} = \frac{43}{3 \cdot 91} = \frac{43}{273} \approx 0.15751.$$
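The fractions in Exercise 479 can be checked exactly with Python's `fractions` module. This is just a sketch of the inclusion-exclusion step in (iv); the variable names are illustrative.

```python
from fractions import Fraction
from math import factorial

total = factorial(15)

# Sisters together: glue them into one unit (2 internal orders, 14 units)
p_sisters = Fraction(2 * factorial(14), total)
# Brothers together: glue them into one unit (3! internal orders, 13 units)
p_brothers = Fraction(factorial(3) * factorial(13), total)
# Both groups together: 12 units
p_both = Fraction(2 * factorial(3) * factorial(12), total)

# Inclusion-exclusion for "sisters together or brothers together"
p_union = p_sisters + p_brothers - p_both

print(p_sisters, 1 - p_brothers, p_both, p_union, float(p_union))
```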


Answer to Exercise 480 (9740 N2012/II/8).

(i) [Scatter diagram of percentage mark y (axis from 0 to 100) against week x (axis from 0 to 6).]

(ii) The trend is one of steady improvement. After a terrible performance in Week 1, Amy resolves to work hard. Her work pays off, with her mark improving week after week. The only deviation from trend occurs on Week 5, because Amy happened to be experimenting with drugs that week. (iii) A linear model would suggest that she eventually breaks the 100% barrier, which is quite impossible. A quadratic model would suggest that her mark eventually starts falling and moreover at an increasing rate, which is quite improbable, unless of course she gets hooked on drugs. (iv) PMCC ≈ −0.929744. (v) We are supposed to say that the most appropriate choice is wherever the magnitude of the PMCC is the largest. Hence, L = 92 is the most appropriate.

(vi) In general, the estimated regression equation is $y - \bar{y} = b(x - \bar{x})$, where $b = \sum \hat{x}_i \hat{y}_i / \sum \hat{x}_i^2$. So in this case, the estimated regression equation is

$$\ln(92 - y) - 3.125912 \approx -0.279599(x - 3.5) \iff \ln(92 - y) \approx -0.279599x + 4.104510.$$

$y \geq 90 \iff -0.28x + 4.10 \leq \ln 2 \iff x \geq 12.2$ (approximately). So she’ll get at least 90% in Week 13.

(vii) As $x \to \infty$, $y \to L$. An interpretation is thus that L is the best mark she can ever hope to get, no matter how long she spends studying.

Answer to Exercise 481 (9740 N2012/II/9). (i) The choice must be binary — a voter must be said to either support the Alliance Party or not support it. The probability that any one polled voter supports the Party is independent of whether another polled voter supports the party.

(ii) $P(A = 3) + P(A = 4) = \binom{30}{3}p^3(1 - p)^{27} + \binom{30}{4}p^4(1 - p)^{26} \approx 0.373068$.

(iii) (a) $np = 16.5 > 5$ and $n(1 - p) = 13.5 > 5$ are both large and so yes, the normal distribution $N(16.5, 16.5 \cdot 0.45)$ would be a suitable approximation for A. (iii) (b) p is large. And so, while it is certainly possible to use the Poisson distribution as an approximation, it would fare poorly.

(iv) $P(A = 15) = \binom{30}{15}p^{15}(1 - p)^{15} \approx 0.06864$.

Thus, $p(1 - p) = p - p^2 \approx \left[0.06864 \Big/ \binom{30}{15}\right]^{1/15} \approx 0.237900$.

Rearranging, $p^2 - p + 0.237900 = 0$. By the quadratic formula, $p \approx 0.39, 0.61$. Given that $p < 0.5$, we have $p \approx 0.39$.
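The steps in (iv) translate directly into code: take the 15th root to isolate p(1 − p), then apply the quadratic formula. A minimal sketch, with illustrative variable names:

```python
from math import comb, sqrt

target = 0.06864  # given value of P(A = 15) for A ~ B(30, p)

# comb(30, 15) * (p(1 - p))^15 = target  =>  p(1 - p) = (target / comb(30, 15))^(1/15)
pq = (target / comb(30, 15)) ** (1 / 15)

# Solve p^2 - p + pq = 0 with the quadratic formula
disc = sqrt(1 - 4 * pq)
p_small, p_large = (1 - disc) / 2, (1 + disc) / 2

# Plug the smaller root (p < 0.5) back in as a check
check = comb(30, 15) * p_small**15 * (1 - p_small) ** 15

print(round(pq, 6), round(p_small, 2), round(p_large, 2), check)
```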


Answer to Exercise 482 (9740 N2012/II/10). (i) The number of gold coins in a randomly chosen square metre is independent of how many gold coins there are in the square metre to its left. No two coins are stacked exactly on top of each other. (ii) Let G ∼ Po(0.8). Then

$$P(G \geq 3) = 1 - e^{-0.8}\left(\frac{0.8^0}{0!} + \frac{0.8^1}{1!} + \frac{0.8^2}{2!}\right) \approx 0.0474226.$$

(iii) Let $H \sim Po(0.8x)$. Then $P(H = 1) = e^{-0.8x}(0.8x) = 0.2$. By calculator, $x \approx 0.323964, 3.1783$. So $x \approx 0.323964$.

(iv) Let $I \sim Po(80)$. Since λ is large, the normal distribution $J \sim N(80, 80)$ is a suitable approximation. Using also the continuity correction,

$$P(I \geq 90) \approx P(J \geq 89.5) = 1 - \Phi\left(\frac{89.5 - 80}{\sqrt{80}}\right) \approx 1 - \Phi(1.062) \approx 1 - 0.8559 = 0.1441.$$

(v) Let $P \sim Po(3)$. Let Z be the number of gold coins and pottery shards found in 50 m². Then $Z \sim Po(190)$. Since λ is large, the normal distribution $Q \sim N(190, 190)$ is a suitable approximation for Z. Using also the continuity correction,

$$P(Z \geq 200) \approx P(Q \geq 199.5) = 1 - \Phi\left(\frac{199.5 - 190}{\sqrt{190}}\right) \approx 1 - \Phi(0.6892) \approx 1 - 0.7546 = 0.2454.$$

(vi) Let X and Y be, respectively, the numbers of gold coins and pottery shards found in 50 m². Then $X \sim Po(40)$ and $Y \sim Po(150)$. Our goal is to find $P(Y \geq 3X) = P(Y - 3X \geq 0)$.

Since $\lambda_X = 40$ and $\lambda_Y = 150$ are both large, the normal distributions $A \sim N(40, 40)$ and $B \sim N(150, 150)$ are suitable approximations for X and Y, respectively. And in turn, $B - 3A \sim N(150 - 3 \cdot 40, 150 + 3^2 \cdot 40) = N(30, 510)$ is a good approximation for $Y - 3X$. Hence, using also the continuity correction,

$$P(Y - 3X \geq 0) \approx P(B - 3A \geq -0.5) = 1 - \Phi\left(\frac{-0.5 - 30}{\sqrt{510}}\right) = \Phi\left(\frac{30.5}{\sqrt{510}}\right) \approx \Phi(1.3506) \approx 0.9116.$$
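The "by calculator" step in part (iii) of this exercise can be imitated with bisection. The sketch below relies on the fact that g(x) = 0.8x e^(−0.8x) − 0.2 rises up to x = 1.25 and falls afterwards, so there is one root on each side of that point. The helper `bisect_root` and the upper bracket of 10 are assumptions made for illustration.

```python
import math

def g(x):
    """P(H = 1) - 0.2 for H ~ Po(0.8 x)."""
    return 0.8 * x * math.exp(-0.8 * x) - 0.2

def bisect_root(f, lo, hi, tol=1e-10):
    """Bisection on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) > 0) == (f_lo > 0):
            lo, f_lo = mid, f(mid)
        else:
            hi = mid
    return (lo + hi) / 2

# g peaks at x = 1.25 (where 0.8 x = 1), so search on each side of it
x_small = bisect_root(g, 0.0, 1.25)
x_large = bisect_root(g, 1.25, 10.0)

print(round(x_small, 6), round(x_large, 4))
```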


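Answer to Exercise 483 (9740 N2011/II/5). (i)

$$P(X < 40.0) = P\left(Z < \frac{40.0 - \mu}{\sigma}\right) = 0.05 \iff \frac{40.0 - \mu}{\sigma} \approx -1.645 \iff \mu \approx 1.645\sigma + 40.0. \quad (1)$$

$$P(X < 70.0) = P\left(Z < \frac{70.0 - \mu}{\sigma}\right) = 0.975 \iff \frac{70.0 - \mu}{\sigma} \approx 1.96 \iff \mu \approx -1.96\sigma + 70.0. \quad (2)$$

Comparing (1) and (2), we have $1.645\sigma + 40.0 \approx -1.96\sigma + 70.0 \iff 3.605\sigma \approx 30.0 \iff \sigma \approx 8.3$ and $\mu \approx 53.7$.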

Answer to Exercise 484 (9740 N2011/II/6). (i) Decide what the age groups will be. Decide how many from each age group are to be interviewed (these are our quotas). Then pick, at random, residents on the street to be interviewed, until the quota for every age group is fulfilled.

(ii) Residents who are on the street may not be a representative sample of the population.

(iii) Random sampling. Acquire a complete list of the city suburb’s population. Use a computer program to randomly pick a sample. Interview this sample. No, it is not realistic. First, one may not be able to acquire a complete list of the city suburb’s population. Second, one may not be able to contact every member of one’s sample.


Answer to Exercise 485 (9740 N2011/II/7). (i) #1. I do indeed make an actual attempt to contact n different friends.

#2. The probability that one friend is contactable is independent of whether another friend is contactable.

(ii) Assumption #1 may not hold because if, say, n = 100, I may run out of time before I attempt to contact all 100 different friends. Assumption #2 may not hold because my friends probably know each other and so they might be watching a movie together with their handphones switched off. This would mean that the probability that one friend is contactable is dependent on whether another friend is contactable.

(iii)

$$P(R \ge 6) = 1 - \sum_{i=0}^{5} P(R = i) = 1 - \sum_{i=0}^{5} \binom{8}{i}\, 0.7^i\, 0.3^{8-i} \approx 0.551774.$$

(iv) Since np = 28 > 5 and n(1 − p) = 12 > 5 are both large, a suitable approximation to R is the normal distribution S ∼ N(28, 8.4). Using also the continuity correction, we have

$$P(R < 25) \approx P(S < 24.5) = \Phi\left(\frac{24.5 - 28}{\sqrt{8.4}}\right) \approx \Phi(-1.2076) = 1 - \Phi(1.2076) \approx 1 - 0.8863 = 0.1137.$$
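The binomial tail in part (iii) can be checked directly. The sketch below assumes n = 8 and p = 0.7 for part (iii), which is the choice that reproduces the quoted 0.551774; the helper names are made up for illustration.

```python
from math import comb

def binom_pmf(n, p, k):
    """P(R = k) for R ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_tail_ge(n, p, k):
    """P(R >= k), computed as 1 minus the sum of P(R = i) for i < k."""
    return 1 - sum(binom_pmf(n, p, i) for i in range(k))

# Assumption: n = 8, p = 0.7 in part (iii).
p_iii = binom_tail_ge(8, 0.7, 6)
print(round(p_iii, 6))
```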


Answer to Exercise 486 (9740 N2011/II/8).

(i) [Scatter diagram of y against x.]

(ii) The PMCC is ≈ −0.992317 which is very large in magnitude. But this merely means that the correlation between x and y is very strong. It does not also imply that their true relationship is definitely linear. Indeed in this case, it appears that the relationship is not linear.

(iii) We are supposed to say that the larger the magnitude of the PMCC, the better the model. In this case, the PMCC of y and x² is −0.999984. And so we’re supposed to conclude that y = a + bx² is the better model.

(iv) In general, the estimated regression equation is $y - \bar{y} = b(x - \bar{x})$, where $b = \sum \hat{x}_i \hat{y}_i / \sum \hat{x}_i^2$. So in this case, the estimated regression equation is

$$y - 10.885714 \approx -0.856210\,(x^2 - 13.25) \iff y \approx -0.856210x^2 + 22.230492.$$

$$y(3.2) = -0.856210 \cdot (3.2)^2 + 22.230492 \approx 13.5.$$


Answer to Exercise 487 (9740 N2011/II/9). (i) (a) 0.6 ⋅ 0.05 + 0.4 ⋅ 0.07 = 0.03 + 0.028 = 0.058.

(i) (b) 0.03/0.058 = 15/29 ≈ 0.517241.

(ii) (a) P(Exactly one faulty) = P(First faulty, second not) + P(Second faulty, first not) = 0.058 (1 − 0.058) + (1 − 0.058) 0.058 = 2 ⋅ 0.058 ⋅ 0.942 = 0.109272.

(ii) (b) Let E be the event “both made by A” and F the event “exactly one faulty”. Then P(Both made by A ∣ Exactly one faulty) = P(E∣F) = P(E ∩ F)/P(F).

But P(E ∩ F) = P(E)P(F∣E) = 0.6² (0.05 ⋅ 0.95 + 0.95 ⋅ 0.05) = 0.0342. Hence P(E∣F) = 0.0342/0.109272 ≈ 0.312980.

Answer to Exercise 488 (9740 N2011/II/10). (i) We are given that T ∼ N(38.0, 5.0²).

Let X be the time taken to install the component after background music is introduced. Assume that X remains normally distributed with standard deviation 5.0 (these are questionable assumptions, but without these we cannot proceed). That is, X ∼ N(µ0, 5.0²). The null hypothesis is H0 ∶ µ0 = 38.0 and the alternative hypothesis is HA ∶ µ0 < 38.0.

(ii) $Z_{0.05} \approx 1.645$. So to reject the null hypothesis, we must have $\bar{t} < \mu_0 - Z_{0.05}\,\sigma/\sqrt{n} = 38.0 - 1.645 \cdot 5.0/\sqrt{50} \approx 36.8$.

(iii) Since the null is not rejected with $\bar{t} = 37.1$, we must have $\bar{t} = 37.1 > \mu_0 - Z_{0.05}\,\sigma/\sqrt{n} = 38.0 - 1.645 \cdot 5.0/\sqrt{n}$. Rearranging, $n < (1.645 \cdot 5.0/0.9)^2 \approx 83.5$. Thus, n ∈ {1, 2, . . . , 83}.
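The critical value in (ii) and the bound on n in (iii) can be reproduced with a few lines of Python. This is only a sketch of the arithmetic above; the variable names are illustrative.

```python
from math import floor, sqrt
from statistics import NormalDist

mu0, sigma, alpha = 38.0, 5.0, 0.05
z = NormalDist().inv_cdf(1 - alpha)   # about 1.645

# (ii) Reject H0 when the sample mean falls below this value (n = 50).
critical_mean = mu0 - z * sigma / sqrt(50)

# (iii) H0 is not rejected at sample mean 37.1 when n < (z * sigma / (mu0 - 37.1))**2.
n_bound = (z * sigma / (mu0 - 37.1)) ** 2
largest_n = floor(n_bound)            # valid here because n_bound is not an integer

print(round(critical_mean, 1), round(n_bound, 1), largest_n)
```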


Answer to Exercise 489 (9740 N2011/II/11). (i) There are in total C(30, 10) ways to choose the committee. There are C(18, 4) × C(12, 6) ways to choose a committee with exactly 4 women. Hence, the desired probability is

$$\binom{18}{4}\binom{12}{6} \Big/ \binom{30}{10} = \frac{\left[(18 \cdot 17 \cdot 16 \cdot 15)/4!\right]\left[(12 \cdot 11 \cdot 10 \cdot 9 \cdot 8 \cdot 7)/6!\right]}{(30 \cdot 29 \cdots 21)/10!} = \frac{17 \cdot 48}{29 \cdot 13 \cdot 23} \approx 0.0941068.$$

(ii) Numbers of ways to choose a committee with (a) exactly r women; and (b) exactly r + 1 women:

$$\text{(a)}\ \binom{18}{r}\binom{12}{10-r}; \qquad \text{(b)}\ \binom{18}{r+1}\binom{12}{9-r}.$$

We are told that the first number is greater than the second, i.e.

$$\binom{18}{r}\binom{12}{10-r} > \binom{18}{r+1}\binom{12}{9-r}$$

$$\iff \frac{18!}{(18-r)!\,r!} \cdot \frac{12!}{(2+r)!\,(10-r)!} > \frac{18!}{(17-r)!\,(r+1)!} \cdot \frac{12!}{(3+r)!\,(9-r)!}$$

$$\iff (17-r)!\,(r+1)!\,(3+r)!\,(9-r)! > (18-r)!\,r!\,(2+r)!\,(10-r)! \quad \text{(as desired).}$$

Continuing with the algebra, we have (r + 1)(3 + r) > (18 − r)(10 − r) ⇐⇒ r² + 4r + 3 > r² − 28r + 180 ⇐⇒ 32r > 177 ⇐⇒ r > 5 + 17/32.

We have just proven that P(R = r) > P(R = r + 1) if and only if r = 6, 7, 8, 9. That is, we have just shown that P(R = 6) > P(R = 7) > P(R = 8) > P(R = 9) > P(R = 10), but P(R = 0) ≤ P(R = 1) ≤ P(R = 2) ≤ P(R = 3) ≤ P(R = 4) ≤ P(R = 5) ≤ P(R = 6).

We have thus shown that 6 is a most-probable-number-of-women and that 7, 8, 9, 10 are not. We must rule out that 5 (or any smaller number) is a most-probable-number-of-women. But clearly, 6!4! ≠ 5!5!, so that

$$\binom{18}{6}\binom{12}{4} \ne \binom{18}{5}\binom{12}{5}.$$

Hence, it is indeed the case that P(R = 5) < P(R = 6). Thus, 6 is indeed the unique most-probable-number-of-women.
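The whole distribution of R (the number of women on the committee) is small enough to tabulate, which gives an independent check on both parts. A minimal sketch, with the constant names chosen here for illustration:

```python
from math import comb

WOMEN, MEN, SEATS = 18, 12, 10
total = comb(WOMEN + MEN, SEATS)

# P(R = r): choose r of the 18 women and 10 - r of the 12 men.
pmf = {r: comb(WOMEN, r) * comb(MEN, SEATS - r) / total for r in range(SEATS + 1)}

p_four_women = pmf[4]                    # part (i)
most_probable = max(pmf, key=pmf.get)    # part (ii)
print(round(p_four_women, 7), most_probable)
```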


Answer to Exercise 490 (9740 N2011/II/12). (i) Let X be the number of people who join the queue in a period of 4 minutes. Then X ∼ Po(4.8) and

$$P(X \ge 8) = 1 - P(X \le 7) = 1 - e^{-4.8}\sum_{i=0}^{7}\frac{4.8^i}{i!} \approx 0.113334.$$

(ii) Let Y be the number of people who join the queue in a period of t seconds. Then Y ∼ Po(1.2t/60) = Po(0.02t). We are told that P(Y ≤ 1) = 0.7. That is,

$$P(Y \le 1) = e^{-0.02t}(1 + 0.02t) = 0.7.$$

By calculator, t ≈ 54.8675.

(iii) Let Z be the number of people who leave the queue over 15 minutes. Then Z ∼ Po(27).

Let B be the number of people who join the queue over 15 minutes. Then B ∼ Po(18).

We wish to find P(35 + B − Z ≥ 24) = P(Z − B ≤ 11).

Since λZ = 27 is large, a suitable approximation for Z is the normal distribution A ∼ N(27, 27). Since λB = 18 is large, a suitable approximation for B is the normal distribution C ∼ N(18, 18). In turn, a suitable approximation for Z − B is A − C ∼ N(9, 45). Hence, using also the continuity correction,

$$P(Z - B \le 11) \approx P(A - C \le 11.5) = \Phi\left(\frac{11.5 - 9}{\sqrt{45}}\right) \approx \Phi(0.3727) \approx 0.6453.$$

(iv) There might be certain periods of time when more planes arrive and other periods when fewer arrive. So the rate at which people join the queue will probably not be constant.
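The “by calculator” step in part (ii) amounts to solving e^(−0.02t)(1 + 0.02t) = 0.7 numerically. One way to do this without a graphing calculator is bisection, sketched below. The function names are made up, and the search interval [0, 1000] is an arbitrary choice that happens to bracket the root.

```python
from math import exp

def p_at_most_one(t):
    """P(Y <= 1) for Y ~ Po(0.02 t)."""
    lam = 0.02 * t
    return exp(-lam) * (1 + lam)

def bisect_decreasing(f, target, lo, hi, tol=1e-9):
    """Find x in [lo, hi] with f(x) = target, assuming f is decreasing on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid    # f is still too large, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

t = bisect_decreasing(p_at_most_one, 0.7, 0.0, 1000.0)
print(round(t, 4))
```

Bisection is used here only because it needs no derivative and P(Y ≤ 1) is decreasing in t for t > 0; any other root-finder would do.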

Answer to Exercise 491 (9740 N2010/II/5). (i) Say we wish to stratify the spectators by age group. One problem is that we may not know what proportion of the spectators belongs to each age group. As such, it may be difficult to get a representative sample.

(ii) Order the spectators by their names, alphabetically. Choose every 100th spectator on the list to survey.

Answer to Exercise 492 (9740 N2010/II/6).

(i)

$$\bar{t} = \frac{\sum t}{n} = \frac{454.3}{11} = 41.3, \qquad s^2 = \frac{\sum t^2 - (\sum t)^2/11}{n - 1} = \frac{18779.43 - 454.3^2/11}{10} = 1.684.$$

(ii) The null hypothesis is H0 ∶ µ0 = 42.0 and the alternative hypothesis is HA ∶ µ0 ≠ 42.0.

$$T = \frac{\bar{t} - \mu_0}{s/\sqrt{n}} = \frac{41.3 - 42.0}{\sqrt{1.684/11}} \approx -1.789.$$

Since $|T| < t_{10,\,0.05} = 1.812$, we are unable to reject the null hypothesis.

Answer to Exercise 493 (9740 N2010/II/7). (i) P(A ∩ B′) = P(A∣B′)P(B′) = 0.8 ⋅ 0.4 = 0.32.

(ii) P(A ∪ B) = P(B) + P(A ∩ B′) = 0.92.

(iii) P(B′∣A) = P(B′ ∩ A) ÷ P(A) = 0.32 ÷ 0.7 = 16/35 ≈ 0.457142857.

(iv) P(A′ ∩ C) = P(A′)P(C) = 0.3 ⋅ 0.5 = 0.15.

(v) P(A′ ∩ B ∩ C) ≤ 0.15.


Answer to Exercise 494 (9740 N2010/II/8). (i) The probability that the number is greater than 30000 is the probability that the first digit is 3, 4, or 5. Answer: 3/5 = 0.6.

(ii) The first three digits are odd and there are 3! ways to arrange them. The last two are even and there are 2! ways to arrange them. The total number of ways to arrange the five digits is 5!. Answer: 3!2!/5! = 1/10 = 0.1.

(iii) If the first digit is 3, the last digit must be 1 or 5, and in each case, there are 3! ways to arrange the middle 3 digits. Similarly, if the first digit is 5, the last digit must be 1 or 3, and in each case, there are 3! ways to arrange the middle 3 digits. If the first digit is 4, the last digit can be 1, 3, or 5, and in each case, there are 3! ways to arrange the middle 3 digits. Altogether then, there are 7 ⋅ 3! ways to get such a number and the desired probability is 7 ⋅ 3!/5! = 7/20 = 0.35.

Answer to Exercise 495 (9740 N2010/II/9). (i) Our desired probability is P(Y > 2X) = P(Y − 2X > 0). Now, $Y - 2X \sim N(400 - 2 \cdot 180,\ 60^2 + 2^2 \cdot 30^2) = N(40, 7200)$. So

$$P(Y - 2X > 0) = 1 - \Phi\left(\frac{0 - 40}{\sqrt{7200}}\right) \approx \Phi(0.4714) \approx 0.6813.$$

(ii) Our desired probability is P(0.12X + 0.05Y > 45). Now,

$$0.12X + 0.05Y \sim N(0.12 \cdot 180 + 0.05 \cdot 400,\ 0.12^2 \cdot 30^2 + 0.05^2 \cdot 60^2) = N(41.6, 21.96)$$

$$\implies P(0.12X + 0.05Y > 45) = 1 - \Phi\left(\frac{45 - 41.6}{\sqrt{21.96}}\right) \approx 1 - \Phi(0.7255) \approx 1 - 0.7658 = 0.2342.$$

(iii) Our desired probability is P(0.12X₁ + 0.12X₂ > 45). Now,

$$0.12X_1 + 0.12X_2 \sim N(0.12 \cdot 180 + 0.12 \cdot 180,\ 0.12^2 \cdot 30^2 + 0.12^2 \cdot 30^2) = N(43.2, 25.92)$$

$$\implies P(0.12X_1 + 0.12X_2 > 45) = 1 - \Phi\left(\frac{45 - 43.2}{\sqrt{25.92}}\right) \approx 1 - \Phi(0.3536) \approx 1 - 0.6381 = 0.3619.$$
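Python's `statistics.NormalDist` supports scaling by a constant and adding or subtracting independent normal variables, so the three linear combinations above can be checked almost verbatim. The sketch below reads X ∼ N(180, 30²) and Y ∼ N(400, 60²) off the working; note that `NormalDist` takes the standard deviation, not the variance.

```python
from statistics import NormalDist

X = NormalDist(180, 30)
Y = NormalDist(400, 60)

# (i) Y - 2X: scaling X by 2 doubles its standard deviation.
p_i = 1 - (Y - X * 2).cdf(0)

# (ii) 0.12X + 0.05Y
p_ii = 1 - (X * 0.12 + Y * 0.05).cdf(45)

# (iii) 0.12*X1 + 0.12*X2 for two independent copies of X.
# Adding NormalDist objects treats them as independent, so this is not 0.24X.
p_iii = 1 - (X * 0.12 + X * 0.12).cdf(45)

print(round(p_i, 4), round(p_ii, 4), round(p_iii, 4))
```

The distinction in (iii) matters: 0.24X would have variance 0.24² ⋅ 30² = 51.84, whereas the sum of two independent copies has variance 25.92.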


Answer to Exercise 496 (9740 N2010/II/10).

(i) [Scatter diagram of F against v.]

(ii) (a) PMCC ≈ 0.986024.

(ii) (b) PMCC ≈ 0.990681.

(iii) We are, as usual, supposed to say that the larger the magnitude of the PMCC, the better the model. So F = c + dv² is the better model.

(iv) In general, the estimated regression equation is $y - \bar{y} = b(x - \bar{x})$, where $b = \sum \hat{x}_i \hat{y}_i / \sum \hat{x}_i^2$. So in this case, the estimated regression equation is

$$F - 14.25 \approx 0.0242420\,(v^2 - 456) \iff F \approx 0.0242420v^2 + 3.195652.$$

$$\text{And } F = 26.0 \iff v \approx \sqrt{(26.0 - 3.195652)/0.0242420} \approx 30.7.$$

To predict a value of v given a value of F , it would be more appropriate to use a regression where v (or a function of v) is the independent variable and F (or a function of F ) is the dependent variable.


Answer to Exercise 497 (9740 N2010/II/11). (i) Let X be the number of calls received in a randomly chosen period of 4 minutes. Then X ∼ Po(12) and

$$P(X = 8) = e^{-12}\,\frac{12^8}{8!} \approx 0.0655233.$$

(ii) Let Y be the number of calls received in a randomly chosen period of t seconds. Then Y ∼ Po(3t/60) = Po(0.05t) and $P(Y = 0) = e^{-0.05t} = 0.2$. So t = (ln 0.2)/(−0.05) ≈ 32.

(iii) Let Z be the number of calls received in a randomly chosen period of 12 hours. Then Z ∼ Po(2160) and a suitable approximation for Z is the normal distribution A ∼ N(2160, 2160). Hence, using also the continuity correction,

$$P(Z > 2200) \approx P(A \ge 2200.5) = 1 - \Phi\left(\frac{2200.5 - 2160}{\sqrt{2160}}\right) \approx 1 - \Phi(0.8714) \approx 1 - 0.8082 = 0.1918.$$

(iv)

$$\binom{6}{2}\, 0.1918^2\, 0.8082^4 \approx 0.2354.$$

(v) Let B be the number of busy days out of 30. Since np ≈ 5.754 > 5 and n(1 − p) > 5, a suitable approximation to B is the normal distribution C ∼ N(5.754, 4.650). So using also the continuity correction,

$$P(B \le 10) \approx P(C \le 10.5) = \Phi\left(\frac{10.5 - 5.754}{\sqrt{4.650}}\right) \approx \Phi(2.201) \approx 0.9861.$$

(Without using any approximation, P(B ≤ 10) ≈ 0.980906.)
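The gap between the normal approximation and the exact binomial value in part (v) can be seen side by side with a short Python sketch. It assumes the rounded value p = 0.1918 from part (iii), so the exact figure may differ from the one quoted above in the later decimal places.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 30, 0.1918   # p is the rounded probability of a busy day from part (iii)

# Exact binomial P(B <= 10)
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(11))

# Normal approximation with continuity correction
approx = NormalDist(n * p, sqrt(n * p * (1 - p))).cdf(10.5)

print(round(exact, 4), round(approx, 4))
```

With np only slightly above 5, the binomial is noticeably right-skewed, which is why the normal approximation overstates P(B ≤ 10) here.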

Answer to Exercise 498 (9740 N2009/II/5). Simply survey people standing outside the theatre waiting for the movie to start. Stop once the quota of 100 persons is met. A disadvantage is that this may not be a representative sample. For example, there will be no late-comers in our sample of 100.

Answer to Exercise 499 (9740 N2009/II/6).

(i) [Scatter diagram of t against x.]

(ii) No. A linear model would imply that several centuries hence, the time taken to run a mile would be negative, which is clearly impossible. The scatter diagram similarly suggests that the rate of improvement is tapering off, rather than linear.

(iii) A quadratic model would imply that the world record time taken to run a mile eventually bottoms out, then starts increasing. But by definition, it is impossible that the world record time increases.

(iv) In general, the estimated regression equation is $y - \bar{y} = b(x - \bar{x})$, where $b = \sum \hat{x}_i \hat{y}_i / \sum \hat{x}_i^2$. So in this case, the estimated regression equation is

$$\ln t - 3.161647 \approx -0.0161280\,(x - 1965) \iff \ln t \approx -0.0161280x + 34.853071.$$

$t(2010) \approx e^{-0.0161280(2010) + 34.853071} \approx 11.4$. So the predicted world record time on 1st January 2010 is 3 m 41.4 s.

Our range of data is 1930-2000. We are extrapolating our data, which might not always work out reliably.


Answer to Exercise 500 (9740 N2009/II/7). (i) Let E and F be the events “a randomly chosen component is faulty” and “a randomly chosen component was supplied by A”. Then

$$P(E) = 0.01p \cdot 0.05 + (1 - 0.01p) \cdot 0.03 = 0.03 + 0.02 \cdot 0.01p,$$

which equals 0.035 when p = 25.

(ii)

$$f(p) = P(F \mid E) = \frac{P(F \cap E)}{P(E)} = \frac{0.01p \cdot 0.05}{0.03 + 0.02 \cdot 0.01p} = \frac{0.05p}{3 + 0.02p} = 2.5 - \frac{7.5}{3 + 0.02p}.$$

$f'(p) = 7.5(3 + 0.02p)^{-2}(0.02) > 0$. This shows that the probability that a randomly chosen component that is faulty was supplied by A is increasing in the percentage of electronic components bought from A. Which is not very surprising.
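A small numerical check of the two formulas, assuming (as the value 0.035 implies) that part (i) uses p = 25. The function names are made up for this sketch.

```python
def p_faulty(p):
    """P(E): a random component is faulty when p% of components come from A."""
    return 0.01 * p * 0.05 + (1 - 0.01 * p) * 0.03

def f(p):
    """P(F | E): a faulty component was supplied by A."""
    return 0.01 * p * 0.05 / p_faulty(p)

# Check monotonicity on whole-number percentages 0, 1, ..., 100.
values = [f(p) for p in range(0, 101)]
is_increasing = all(a < b for a, b in zip(values, values[1:]))

print(round(p_faulty(25), 3), round(f(25), 4), is_increasing)
```

This only samples f at integer percentages, so it illustrates rather than proves the claim; the derivative argument above is what establishes it.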

Answer to Exercise 501 (9740 N2009/II/8). (i) We have 8 letters total, 3 of which are repeated. Hence, there are 8!/3! = 6720 possible permutations.

(ii) Let TD or DT be a single letter. Then we have 7 “letters” total, 3 of which are repeated, so there are 2! × 7!/3! possible permutations that we do not want. So there are 6720 − 2! × 7!/3! = 5040 possible permutations that we do want.

(iii) The 4 consonants by themselves have 4! possible permutations. The 4 vowels by themselves have 4! ÷ 3! = 4 possible permutations. The first letter can either be a consonant or a vowel. Hence, there are in total 2 × 4! × 4 = 192 possible permutations.

(iv) There are only four broad possibilities: E _ _ E _ _ E _, E _ _ E _ _ _ E, E _ _ _ E _ _ E, and _ E _ _ E _ _ E. Each of these has 5! possible permutations. Hence, there are in total 4 × 5! = 480 possible permutations.
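With only 8 letters, all four counts can be confirmed by brute force. The sketch below uses ELEVATED purely as a stand-in word with the letter pattern the answer describes (three E's, one other vowel, four distinct consonants including T and D); the actual word is not stated in this answer. Condition (iv) is read off the four patterns listed above as “at least two letters between any two E's”.

```python
from itertools import permutations

# Assumption: ELEVATED stands in for the word in the question.
WORD = "ELEVATED"
VOWELS = set("AEIOU")

arrangements = set(permutations(WORD))   # the set keeps distinct arrangements only

def td_apart(w):
    """T and D are not next to each other."""
    return abs(w.index("T") - w.index("D")) != 1

def alternates(w):
    """Consonants and vowels alternate."""
    return all((a in VOWELS) != (b in VOWELS) for a, b in zip(w, w[1:]))

def es_spread(w):
    """At least two other letters between any two E's."""
    pos = [i for i, ch in enumerate(w) if ch == "E"]
    return all(b - a >= 3 for a, b in zip(pos, pos[1:]))

counts = (
    len(arrangements),
    sum(map(td_apart, arrangements)),
    sum(map(alternates, arrangements)),
    sum(map(es_spread, arrangements)),
)
print(counts)
```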


Answer to Exercise 502 (9740 N2009/II/9). (i) $\bar{M} \sim N(2.5,\ 0.1^2/n)$. So

$$P(\bar{M} > 2.53) = 1 - \Phi\left(\frac{2.53 - 2.5}{\sqrt{0.1^2/n}}\right) = 1 - \Phi(0.3\sqrt{n}) = 0.0668 \iff \Phi(0.3\sqrt{n}) = 0.9332 \iff 0.3\sqrt{n} = 1.5 \iff n = 25.$$

(ii) Assuming the thicknesses of the textbooks are independently distributed,

$$X = M_1 + \dots + M_{21} + S_1 + \dots + S_{24} \sim N(21 \cdot 2.5 + 24 \cdot 2.0,\ 21 \cdot 0.1^2 + 24 \cdot 0.08^2) = N(100.5, 0.3636).$$

$$\text{Now, } P(X \le 100) = \Phi\left(\frac{100 - 100.5}{\sqrt{0.3636}}\right) \approx 1 - \Phi(0.8292) \approx 1 - 0.7964 = 0.2036.$$

(iii) Again assuming the thicknesses of the textbooks are independently distributed, our desired probability is P(S₁ + S₂ + S₃ + S₄ < 3M) = P(S₁ + S₂ + S₃ + S₄ − 3M < 0). Now, $S_1 + S_2 + S_3 + S_4 - 3M \sim N(4 \cdot 2.0 - 3 \cdot 2.5,\ 4 \cdot 0.08^2 + 3^2 \cdot 0.1^2) = N(0.5, 0.1156)$. Hence,

$$P(S_1 + S_2 + S_3 + S_4 - 3M < 0) = \Phi\left(\frac{0 - 0.5}{\sqrt{0.1156}}\right) \approx 1 - \Phi(1.4706) \approx 1 - 0.9293 = 0.0707.$$

(iv) The thicknesses of the textbooks are independently distributed.


Answer to Exercise 503 (9740 N2009/II/10).

(i)

$$\bar{x} = \frac{\sum x}{n} = \frac{86.4}{9} = 9.6, \qquad s^2 = \frac{\sum x^2 - (\sum x)^2/n}{n - 1} = \frac{835.92 - 86.4^2/9}{8} \approx 0.81.$$

(ii) A necessary assumption is that X is normally distributed. The null hypothesis is H0 ∶ µ0 = 10 and the alternative hypothesis is HA ∶ µ0 ≠ 10.

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{9.6 - 10}{\sqrt{0.81/9}} = -\frac{4}{3}.$$

Since $|t| < t_{8,\,0.025} = 2.306$, we are unable to reject the null hypothesis.

The sample size is small. And so we are unable to appeal to the CLT and claim that a normal distribution is a suitable approximate distribution for x¯. (Author’s remark: It actually makes no sense to say that “the CLT does not apply in this context”. The CLT certainly applies. It is merely that the normal distribution is a poor approximation for the sample mean.)

(iii) We’d use the Z-test instead.


Answer to Exercise 504 (9740 N2009/II/11). (i) The probability that any observed car is red is independent of whether any other observed car is red. Each car is either strictly red or strictly not red.

(ii)

$$P(4 \le R < 8) = \binom{20}{4}0.15^4\,0.85^{16} + \binom{20}{5}0.15^5\,0.85^{15} + \binom{20}{6}0.15^6\,0.85^{14} + \binom{20}{7}0.15^7\,0.85^{13} \approx 0.346354.$$

(iii) Since np and n(1 − p) are large, a suitable approximation to R is the normal distribution X ∼ N(72, 50.4). Hence, using also the continuity correction,

$$P(R < 60) \approx P(X < 59.5) = \Phi\left(\frac{59.5 - 72}{\sqrt{50.4}}\right) \approx 1 - \Phi(1.761) \approx 1 - 0.9609 = 0.0391.$$

(iv) Since n is large and p is small, a suitable approximation to R is the Poisson distribution Y ∼ Po(4.8). Hence,

$$P(R = 3) \approx P(Y = 3) = e^{-4.8}\,\frac{4.8^3}{3!} \approx 0.152.$$

(v)

$$P(R = 0) + P(R = 1) = \binom{20}{0}p^0(1 - p)^{20} + \binom{20}{1}p^1(1 - p)^{19} = (1 - p)^{19}(1 - p + 20p) = 0.2.$$

By calculator, p ≈ 0.142432.

Answer to Exercise 505 (9740 N2008/II/5). (i) Take any ordered list of the 950 pupils. From the list, pick every 19th student.

(ii) We might want each level to be equally well-represented. For example, we might like approximately one-sixth of the sample to be from Primary 1, another sixth from Primary 2, etc. In which case we’d probably prefer to do a stratified sample. The method might be something like this: Pick from the aforementioned ordered list the first 108 Primary 1 students, the first 108 Primary 2 students, etc.

Answer to Exercise 506 (9740 N2008/II/6). Let the mass of calcium in a bottle (after the extreme weather) be X ∼ N(µ0, σ²). (We have made the necessary assumption that X is normally distributed.) The null hypothesis is H0 ∶ µ0 = 78 and the alternative hypothesis is HA ∶ µ0 ≠ 78. Now,

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{\sum x/n - 78}{\sqrt{\left[\sum x^2 - (\sum x)^2/n\right]/(n - 1)}\,\big/\sqrt{n}} \approx -1.207.$$

Since ∣t∣ < t14,0.025 ≈ 2.145, we are unable to reject the null hypothesis.

Answer to Exercise 507 (9740 N2008/II/7). (i) Let A1 denote the event that A wins the first set. Similarly define A2, A3, B1, B2, and B3. P(A2) = P(A1 ∩ A2) + P(B1 ∩ A2) = 0.6 ⋅ 0.7 + 0.4 ⋅ 0.2 = 0.5.

(ii) P(A wins) = P(A1 ∩ A2) + P(A1 ∩ B2 ∩ A3) + P(B1 ∩ A2 ∩ A3) = 0.42 + 0.6 ⋅ 0.3 ⋅ 0.2 + 0.4 ⋅ 0.2 ⋅ 0.7 = 0.42 + 0.036 + 0.056 = 0.512.

(iii) P(B1 ∩ A2 ∩ A3)/P(A wins) = 0.056/0.512 = 0.109375.


Answer to Exercise 508 (9740 N2008/II/8). (i) PMCC ≈ 0.9695281468. This large PMCC merely suggests that there is a strong (positive) linear relationship between x and t. However, the true relationship between x and t could be something other than linear.

(ii) [Scatter diagram of t against x, with the point P marked.]

(iii) Without P, it appears that t is increasing, but at a decreasing rate. So a log model might be appropriate.

(iv) In general, the estimated regression equation is $y - \bar{y} = b(x - \bar{x})$, where $b = \sum \hat{x}_i \hat{y}_i / \sum \hat{x}_i^2$.

So in this case, the estimated regression equation is

t − 6.45 ≈ 4.396563 (ln x − 1.143002) ⇐⇒ t ≈ 4.396563 ln x + 1.424722.

So for the model t = a + b ln x, the least square estimates are a ≈ 1.4 and b ≈ 4.4.

(v) t(x = 4.8) ≈ 4.4 ln(4.8) + 1.4 ≈ 8.3.

(vi) This would be an extrapolation of the data, which may or may not be wise.


Answer to Exercise 509 (9740 N2008/II/9). (i) X ∼ Po(1.8).

$$P(X \ge 4) = 1 - P(X \le 3) = 1 - e^{-1.8}\left(\frac{1.8^0}{0!} + \frac{1.8^1}{1!} + \frac{1.8^2}{2!} + \frac{1.8^3}{3!}\right) \approx 0.108708.$$

(ii) Let Y be the total number of pianos sold in a given week. Then Y ∼ Po(4.4). $P(Y = 4) = e^{-4.4}\,4.4^4/4! \approx 0.191736$.

(iii) Let Z be the number of grand pianos sold in 50 weeks. Then Z ∼ Po(90). Since λZ is large, a suitable approximation is the normal distribution A ∼ N(90, 90). Hence, using also the continuity correction,

$$P(Z < 80) \approx P(A < 79.5) = \Phi\left(\frac{79.5 - 90}{\sqrt{90}}\right) \approx 1 - \Phi(1.1068) \approx 1 - 0.8657 = 0.1343.$$

(iv) An organisation might buy a relatively-large number of grand pianos on any given day. So it is not likely that the rate at which grand pianos are sold is constant throughout the year.
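The Poisson sums in parts (i) and (ii) are easy to get wrong by one term, since P(X ≤ 3) needs the four terms i = 0, 1, 2, 3. A minimal Python sketch (the helper name `poisson_cdf` is made up here):

```python
from math import exp, factorial

def poisson_cdf(lam, k):
    """P(X <= k) for X ~ Po(lam): sums the k + 1 terms i = 0, ..., k."""
    return exp(-lam) * sum(lam**i / factorial(i) for i in range(k + 1))

p_at_least_4 = 1 - poisson_cdf(1.8, 3)                     # part (i)
p_exactly_4 = poisson_cdf(4.4, 4) - poisson_cdf(4.4, 3)    # part (ii)

print(round(p_at_least_4, 6), round(p_exactly_4, 6))
```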


Answer to Exercise 510 (9740 N2008/II/10). (i) $\binom{9}{8} = 9$.

(ii) $\binom{3}{2}\binom{4}{3}\binom{5}{3} = 3 \cdot 4 \cdot 10 = 120$.

(iii) $\binom{5}{4}\binom{7}{4} + \binom{5}{5}\binom{7}{3} = 5 \cdot 35 + 1 \times 35 = 210$.

(iv) The number of ways to have

• No diplomats from K (i.e. only diplomats from L and M) is $\binom{9}{8}$;
• No diplomats from L is $\binom{8}{8}$;
• No diplomats from M is 0.

The total number of ways to choose the diplomats is $\binom{12}{8}$. Hence the number of ways to have at least 1 diplomat from each island is

$$\binom{12}{8} - \left[\binom{9}{8} + \binom{8}{8}\right] = 495 - (9 + 1) = 485.$$
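The complement count in part (iv) can be confirmed by listing every group of 8. The sketch assumes island sizes of 3 (K), 4 (L) and 5 (M), which is what the binomial coefficients above suggest; swapping the sizes of K and L would not change the total.

```python
from itertools import combinations
from math import comb

# Assumption: 3 diplomats from K, 4 from L, 5 from M.
diplomats = (
    [("K", i) for i in range(3)]
    + [("L", i) for i in range(4)]
    + [("M", i) for i in range(5)]
)

# Complement method: all groups minus groups that miss an island entirely.
by_formula = comb(12, 8) - (comb(9, 8) + comb(8, 8))

# Direct enumeration of groups containing at least one diplomat per island.
by_enumeration = sum(
    1
    for group in combinations(diplomats, 8)
    if {island for island, _ in group} == {"K", "L", "M"}
)

print(by_formula, by_enumeration)
```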


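Answer to Exercise 511 (9740 N2008/II/11). (i) $X_1 + X_2 \sim N(100,\ 2 \cdot 8^2)$. So

$$P(X_1 + X_2 > 120) = 1 - \Phi\left(\frac{120 - 100}{\sqrt{2 \cdot 8^2}}\right) \approx 1 - \Phi(1.768) \approx 1 - 0.9615 = 0.0385.$$

(ii) $X_1 - X_2 \sim N(0,\ 2 \cdot 8^2)$. So

$$P(X_1 > X_2 + 15) = P(X_1 - X_2 > 15) = 1 - \Phi\left(\frac{15 - 0}{\sqrt{2 \cdot 8^2}}\right) \approx 1 - \Phi(1.3258) \approx 1 - 0.9075 = 0.0925.$$

(iii)

$$P(Y < 74) = \Phi\left(\frac{74 - \mu}{\sigma}\right) = 0.0668 \iff \frac{74 - \mu}{\sigma} = -1.5.$$

$$P(Y > 146) = 1 - \Phi\left(\frac{146 - \mu}{\sigma}\right) = 0.0668 \iff \Phi\left(\frac{146 - \mu}{\sigma}\right) = 0.9332 \iff \frac{146 - \mu}{\sigma} = 1.5.$$

$$\frac{146 - \mu}{\sigma} - \frac{74 - \mu}{\sigma} = 1.5 - (-1.5) = \frac{72}{\sigma} = 3 \iff \sigma = 24 \text{ and } \mu = 110.$$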

Since σ = 8a and µ = 50a + b, a = 3 and b = −40.

Answer to Exercise 512 (9233 N2008/I/1). There are 3! ways to arrange the 3 groups of books. And within each group of books, we can permute them as usual. So there are 3!6!5!4! = 12441600 ways.

Answer to Exercise 513 (9233 N2008/II/23). By independence, $p_{A \cap B} = p_A p_B$. Also $p_{A \cup B} = p_A + p_B - p_{A \cap B} = p_A + p_B - p_A p_B$. Plugging in the given numbers, we have $0.4 = 0.2 + p_B - 0.2p_B$, so $p_B = 0.25$. $p_B p_C = 0.25 \cdot 0.4 = 0.1 = p_{B \cap C}$, so that by definition, B and C are indeed independent.


Answer to Exercise 514 (9233 N2008/II/26). (i) Let X ∼ Po(3). $P(X > 2) = 1 - P(X \le 2) = 1 - e^{-3}(1 + 3 + 9/2) = 1 - 8.5e^{-3} \approx 1 - 0.423 = 0.577$.

(ii) Let Y be the number of times the machine will break down in a period of four weeks. Then Y ∼ Po(12). $P(Y \le 3) = e^{-12}(1 + 12 + 12^2/2 + 12^3/6) \approx 0.00229$.

(iii) Let Z be the number of times the machine will break down in a period of 16 weeks. Then Z ∼ Po(48). Since λZ is large, a suitable approximation for Z is the normal distribution A ∼ N(48, 48). Hence, using also the continuity correction,

$$P(Z > 50) \approx P(A > 50.5) = 1 - \Phi\left(\frac{50.5 - 48}{\sqrt{48}}\right) \approx 1 - \Phi(0.3608) \approx 1 - 0.6409 = 0.3591.$$

Answer to Exercise 515 (9233 N2008/II/27). (i) Let the mass after the adjustment be X ∼ N(µ0, σ²). It is necessary to assume that these masses remain normally distributed. The null hypothesis is H0 ∶ µ0 = 32.40 and the alternative hypothesis is HA ∶ µ0 ≠ 32.40. Now,

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{32.00 - 32.40}{\sqrt{2.892/80}} \approx -2.104.$$

Since ∣t∣ > t79,0.025 ≈ 1.99, we can reject the null hypothesis.

(ii) This means that if H0 were true and we tested infinitely many size-80 samples (as done above), we’d reject H0 in 5% of the samples. (iii) The one-tailed p-value is ≈ 0.0193. So the least level of significance is 1.93%.


Answer to Exercise 516 (9233 N2008/II/29). (i) Let X ∼ N(50, 4²). The probability that Mr Sim is late on any given day is

$$P(X > 55) = 1 - \Phi\left(\frac{55 - 50}{4}\right) = 1 - \Phi(1.25) \approx 1 - 0.8944 = 0.1056.$$

Assuming that the probability that he’s late each day is independent of whether he was late on any other day, the probability that he will be late no more than once in 5 days is

$$\binom{5}{0}0.1056^0\,0.8944^5 + \binom{5}{1}0.1056^1\,0.8944^4 \approx 0.910.$$

(ii) Let Y ∼ N(40, 5²). Our desired probability is P(X − Y − 5 < 0). Assuming the journey times of Messrs Sim and Lee are independent, $X - Y - 5 \sim N(5,\ 4^2 + 5^2)$. Thus,

$$P(X - Y - 5 < 0) = \Phi\left(\frac{0 - 5}{\sqrt{4^2 + 5^2}}\right) \approx 1 - \Phi(0.7809) \approx 1 - 0.7826 = 0.2174.$$

(iii) Assume that the journey times of Messrs Sim and Lee each day are independent. Then the desired probability is

$$\binom{5}{3}0.2174^3\,0.7826^2 + \binom{5}{4}0.2174^4\,0.7826^1 + \binom{5}{5}0.2174^5\,0.7826^0 \approx 0.0722.$$


Answer to Exercise 517 (9233 N2008/II/30). (i) Let M ∼ N(µ, σ²).

$$P(M < 86.50) = \Phi\left(\frac{86.50 - \mu}{\sigma}\right) = 0.12 \iff \frac{86.50 - \mu}{\sigma} = -1.175. \quad (1)$$

$$P(M > 92.25) = 1 - \Phi\left(\frac{92.25 - \mu}{\sigma}\right) = 0.2 \iff \Phi\left(\frac{92.25 - \mu}{\sigma}\right) = 0.8 \iff \frac{92.25 - \mu}{\sigma} = 0.842. \quad (2)$$

(2) minus (1) yields 5.75/σ = 2.017 ⇐⇒ σ ≈ 2.85. And now µ ≈ 89.85.

(ii) Let X ∼ N(µ, σ²).

$$P(\mu - 2 \le X \le \mu + 2) = 0.8 \implies P(X \le \mu + 2) = 0.9 \iff \Phi\left(\frac{2}{\sigma}\right) = 0.9 \iff \frac{2}{\sigma} \approx 1.281 \iff \sigma \approx 1.56.$$

(iii) Let $\bar{X} \sim N(\mu,\ \sigma^2/n)$. Then

$$P(\bar{X} \ge \mu + 0.50) \le 0.1 \iff 1 - \Phi\left(\frac{0.50}{\sigma/\sqrt{n}}\right) \le 0.1 \iff 0.9 \le \Phi\left(\frac{0.50\sqrt{n}}{\sigma}\right) \iff \frac{0.50\sqrt{n}}{\sigma} \ge 1.281 \iff \frac{0.50\sqrt{n}}{\sigma} \ge \frac{2}{\sigma} \iff 0.50\sqrt{n} \ge 2 \iff n \ge 16.$$

Answer to Exercise 518 (9740 N2007/II/5). (i) Consider a survey of whether students like a particular teacher. A quota of 10 students is to be chosen. Take a list of the teacher’s students, sort their names alphabetically, and pick the first 10 students on the list. One disadvantage is that this sample of 10 students might not be representative. For example, they might all be siblings from the same family of Angs.

(ii) Yes. If say the teacher teaches 10 different classes, we could stratify our sample by class and pick 1 student from each class.


Answer to Exercise 519 (9740 N2007/II/6).

(i)

$$\binom{10}{0}0.24^0\,0.76^{10} + \binom{10}{1}0.24^1\,0.76^9 + \dots + \binom{10}{4}0.24^4\,0.76^6 \approx 0.933.$$

(ii) Let X ∼ B(1000, 0.24) be the number of people in a sample of 1000 that have gene A. Since np = 240 > 5 and n(1 − p) = 760 > 5 are both large, a suitable approximation for X is the normal distribution Y ∼ N(240, 182.4). Hence, using also the continuity correction,

$$P(230 \le X \le 260) \approx P(229.5 \le Y \le 260.5) = \Phi\left(\frac{260.5 - 240}{\sqrt{182.4}}\right) - \Phi\left(\frac{229.5 - 240}{\sqrt{182.4}}\right) \approx \Phi(1.5179) - \Phi(-0.7775) \approx 0.9355 - 0.2180 \approx 0.7175.$$

(iii) Let Z ∼ B(1000, 0.003) be the number of people in a sample of 1000 that have gene B. Since n is large and p is small, a suitable approximation for Z is the Poisson distribution A ∼ Po(3). Hence,

$$P(2 \le Z < 5) \approx P(2 \le A < 5) = P(A = 2) + P(A = 3) + P(A = 4) = e^{-3}\left(3^2/2 + 3^3/6 + 3^4/24\right) \approx 0.616.$$

Answer to Exercise 520 (9740 N2007/II/7).

(i)

$$\bar{x} = \frac{\sum x}{n} = \frac{4626}{150} = 30.84 \quad \text{and} \quad s^2 = \frac{\sum x^2 - (\sum x)^2/n}{n - 1} \approx 33.7259.$$

(ii) Let H0 ∶ µ0 = 30 and HA ∶ µ0 > 30 be the null and alternative hypotheses. Now,

$$Z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \approx \frac{30.84 - 30}{\sqrt{33.7259/150}} \approx 1.772.$$

Since Z > Z0.05 = 1.645, we can reject the null hypothesis.

(iii) We used the Z-test. The sample size is large, so the normal distribution is a good approximation provided the underlying distribution is “nice enough”.

Answer to Exercise 521 (9740 N2007/II/8). (i) Let C be the weight of a randomly chosen chicken. Then C ∼ N(2.2, 0.5²). Then 3C ∼ N(3 ⋅ 2.2, 3² ⋅ 0.5²) = N(6.6, 1.5²) and

P(3C > 7) = 1 − Φ((7 − 6.6)/1.5) = 1 − Φ(4/15) ≈ 0.3949.

(ii) Let T be the weight of a randomly chosen turkey. Then T ∼ N(10.5, 2.1²). Then 5T ∼ N(5 ⋅ 10.5, 5² ⋅ 2.1²) = N(52.5, 10.5²) and

P(5T > 55) = 1 − Φ((55 − 52.5)/10.5) = 1 − Φ(5/21) ≈ 0.405904.

Thus, P(3C > 7) ⋅ P(5T > 55) ≈ 0.160.

(iii) 3C + 5T ∼ N(6.6 + 52.5, 1.5² + 10.5²) = N(59.1, 112.5). So

P(3C + 5T > 62) = 1 − Φ((62 − 59.1)/√112.5) ≈ 1 − Φ(0.2734) ≈ 0.392.

(iv) The event “both chicken costs more than $7 and turkey costs more than $55” is a proper subset of the event “chicken and turkey together cost more than $62”. By the monotonicity of probability, the probability of the latter is greater than that of the former.

Answer to Exercise 522 (9740 N2007/II/9).

(i) (a) 12!. (b) 6! ⋅ 2⁶.

(ii) (a) 11!.

(ii) (b) Fix any man. Then we must have to his right: Woman, man, woman, man, etc. So 6!5!.

(ii) (c) Fix any man A. Then we must have

• To his right: “Wife A, some other man, that some other man’s wife, etc.”; OR
• To his left: “Wife A, some other man, that some other man’s wife, etc.”.

In the first scenario, we have 5! possible arrangements. Likewise in the second. Altogether 2 ⋅ 5! possible arrangements.

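Answer to Exercise 523 (9740 N2007/II/10).

(i)

$$P(1, 1, 1) = \frac{1}{8} \cdot \frac{1}{4} \cdot \frac{1}{2} = \frac{1}{64}.$$

(ii)

$$P(1, 1) + P(1, 0, 1) + P(0, 1, 1) = \frac{1}{8} \cdot \frac{1}{4} + \frac{1}{8} \cdot \frac{3}{4} \cdot \frac{1}{4} + \frac{7}{8} \cdot \frac{1}{8} \cdot \frac{1}{4} = \frac{8 + 2 \cdot 3 + 7}{256} = \frac{21}{256}.$$

(iii) Let E and F be the events that “the third throw is successful” and “exactly two of the three throws are successful”.

$$P(E \cap F) = P(1, 0, 1) + P(0, 1, 1) = \frac{1}{8} \cdot \frac{3}{4} \cdot \frac{1}{4} + \frac{7}{8} \cdot \frac{1}{8} \cdot \frac{1}{4} = \frac{13}{256}.$$

$$P(F) = P(E \cap F) + P(E' \cap F) = \frac{13}{256} + P(1, 1, 0) = \frac{17}{256}.$$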

Thus, P(E∣F ) = P(E ∩ F ) ÷ P(F ) = 13/17.
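The fractions above can be reproduced by enumerating all eight outcome sequences. The sketch below assumes the rule that the working appears to use: the first throw succeeds with probability 1/8, a success doubles the success probability of the next throw, and a failure leaves it unchanged. That rule is inferred from the products above, not quoted from the question.

```python
from fractions import Fraction
from itertools import product

def path_probability(outcomes, first=Fraction(1, 8)):
    """Probability of a sequence of throws (1 = success, 0 = failure).

    Assumption: a success doubles the success probability of the next throw;
    a failure leaves it unchanged.
    """
    prob, p = Fraction(1), first
    for hit in outcomes:
        prob *= p if hit else 1 - p
        if hit:
            p *= 2
    return prob

paths = {o: path_probability(o) for o in product((0, 1), repeat=3)}

# Part (ii): P(1,1) + P(1,0,1) + P(0,1,1) is the chance of at least two successes.
p_at_least_two = sum(pr for o, pr in paths.items() if sum(o) >= 2)

# Part (iii): condition on exactly two successes.
p_F = sum(pr for o, pr in paths.items() if sum(o) == 2)
p_E_and_F = sum(pr for o, pr in paths.items() if sum(o) == 2 and o[2] == 1)

print(p_at_least_two, p_E_and_F / p_F)
```

Using `Fraction` keeps every value exact, so the results can be compared with 21/256 and 13/17 without rounding.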


Answer to Exercise 524 (9740 N2007/II/11). (i) In general, the estimated regression equation is $y - \bar{y} = b(x - \bar{x})$, where $b = \sum \hat{x}_i \hat{y}_i / \sum \hat{x}_i^2$. So in this case, the estimated regression equation is

$$x - 32 \approx -0.259701\,(t - 131.666667) \iff x \approx -0.259701t + 66.194030.$$

(ii) x(t = 300) ≈ −0.259701 ⋅ (300) + 66.194030 ≈ −11.7.

From the scatter diagram, the linear model does not appear to be suitable. Moreover, the linear model predicts that at t = 300, x < 0, which is impossible.

(iii) PMCC ≈ −0.993839. Its magnitude is larger than that of −0.912 and very close to 1. It would appear that the regression of ln x on t is a more appropriate model.

(iv) In general, the estimated regression equation is $y - \bar{y} = b(x - \bar{x})$, where $b = \sum \hat{x}_i \hat{y}_i / \sum \hat{x}_i^2$. So in this case, the estimated regression equation is

$$\ln x - 2.995391 \approx -0.0123434\,(t - 131.666667) \iff \ln x \approx -0.0123434t + 4.620609.$$

(v) x = 15 ⟹ t ≈ (4.620609 − ln 15)/0.0123434 ≈ 155.

Answer to Exercise 525 (9233 N2007/I/4). (i) It cannot be that all three vertices are collinear. Thus, one vertex must be chosen from the upper line segment and the other must be chosen from the lower line segment. Hence, there are 3 × 6 = 18 possible triangles.

(ii) Consider triangles that do not have A as a vertex. Two vertices must be chosen from one line segment and the third must be chosen from the other. So there are $\binom{3}{2} \cdot 7 + 4 \cdot \binom{6}{2} = 21 + 60 = 81$ possible triangles. Now, including also triangles with A as a vertex, we have 99 possible triangles.


Answer to Exercise 526 (9233 N2007/II/23). (i) $X \sim (30, 5^2) \implies \bar{X}_{100} \sim (30, 5^2/100) = (30, 0.25)$. Since the sample size is sufficiently large, by the Central Limit Theorem, a suitable approximation for $\bar{X}_{100}$ is the normal distribution Y ∼ N(30, 0.25). So

$$P(29.2 \le \bar{X}_{100} \le 30.8) \approx P(29.2 \le Y \le 30.8) \approx 0.945201 - 0.054799 \approx 0.890.$$

(ii) The distribution is “sufficiently nice” that with a sample size of 100, it is appropriate to use the CLT.

Answer to Exercise 527 (9233 N2007/II/25). (i) P(W ∣B) = 20/52 = 5/13 ≈ 0.384615.

(ii) P(B∣W ) = 20/40 = 0.5.

(iii) P(B ∪ W) = (40 + 32)/100 = 72/100 = 0.72.

(iv) P(W)P(B) = 0.4 ⋅ 0.52 = 0.208 ≠ 0.2 = P(B ∩ W) and so W and B are not independent.

There are men who take chemistry (equivalently, P(M ∩ C) ≠ 0), so M and C are not mutually exclusive.

Answer to Exercise 528 (9233 N2007/II/26). (i) Let X be the number of genuine call-outs in a randomly chosen two-week period. Then X ∼ Po(4) and

$$P(X < 6) = e^{-4}\left(1 + 4 + \frac{4^2}{2!} + \frac{4^3}{3!} + \frac{4^4}{4!} + \frac{4^5}{5!}\right) \approx 0.785130.$$

(ii) Let Y be the total number of call-outs in a randomly chosen six-week period. Then Y ∼ Po(15) and since λY is large, a suitable approximation for Y is the normal distribution Z ∼ N(15, 15). Hence, using also the continuity correction,

$$P(Y > 19) \approx P(Z > 19.5) \approx 1 - \Phi\left(\frac{19.5 - 15}{\sqrt{15}}\right) \approx 0.123.$$

Answer to Exercise 529 (9233 N2007/II/27). (i) $L + H \sim N(5 + 3, 0.1^2 + 0.05^2) = N(8, 0.0125)$. So

$P(7.9 \le L + H \le 8.2) = \Phi\left(\frac{8.2 - 8.0}{\sqrt{0.0125}}\right) - \Phi\left(\frac{7.9 - 8.0}{\sqrt{0.0125}}\right) \approx 0.963 - 0.185 \approx 0.778$.

(ii) $0.74L + 0.86H \sim N(0.74 \cdot 5 + 0.86 \cdot 3, 0.74^2 \cdot 0.1^2 + 0.86^2 \cdot 0.05^2) = N(6.28, 0.007325)$. So

$P(6.1 \le 0.74L + 0.86H \le 6.2) = \Phi\left(\frac{6.2 - 6.28}{\sqrt{0.007325}}\right) - \Phi\left(\frac{6.1 - 6.28}{\sqrt{0.007325}}\right) \approx 0.175 - 0.018 \approx 0.157$.
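The two linear combinations can be evaluated with a small helper. This is a sketch that assumes L and H are independent (which is what adding the variances requires); `linear_combo` is a hypothetical name. Evaluated this way, the variance expression in part (ii), $0.74^2 \cdot 0.1^2 + 0.86^2 \cdot 0.05^2$, comes out as 0.007325.

```python
from statistics import NormalDist

# Means and standard deviations of L and H, as used in the answer above.
mu_L, sd_L = 5, 0.1
mu_H, sd_H = 3, 0.05

def linear_combo(a, b):
    """Distribution of a*L + b*H, assuming L and H are independent normals."""
    mean = a * mu_L + b * mu_H
    var = a ** 2 * sd_L ** 2 + b ** 2 * sd_H ** 2
    return NormalDist(mean, var ** 0.5)

# (i) L + H ~ N(8, 0.0125)
d1 = linear_combo(1, 1)
p_i = d1.cdf(8.2) - d1.cdf(7.9)

# (ii) 0.74L + 0.86H ~ N(6.28, 0.007325)
d2 = linear_combo(0.74, 0.86)
p_ii = d2.cdf(6.2) - d2.cdf(6.1)

print(round(p_i, 3), round(d2.variance, 6), round(p_ii, 3))
```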

Answer to Exercise 533 (9233 N2006/II/26). (i) Let X be the number of severe floods in a randomly-chosen 100-year period. Then $X \sim \mathrm{Po}(2)$. So $[P(X = 1)]^2 = (e^{-2} \cdot 2)^2 = 4e^{-4} \approx 0.0733$.

(ii) Let Y be the number of severe floods in a randomly-chosen 1000-year period. Then $Y \sim \mathrm{Po}(20)$. Since $\lambda_Y$ is large, a suitable approximation for Y is the normal distribution $Z \sim N(20, 20)$. Hence, using also the continuity correction, $P(Y > 25) \approx P(Z > 25.5) = 1 - \Phi\left(\frac{25.5 - 20}{\sqrt{20}}\right) \approx 0.109$.
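A short numerical check of both parts, as a sketch using only the standard library. The helper name `poisson_pmf` is introduced here for illustration.

```python
from math import exp, factorial
from statistics import NormalDist

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam ** k / factorial(k)

# (i) [P(X = 1)]^2 for X ~ Po(2), which equals 4e^{-4} (about 0.0733).
p_i = poisson_pmf(1, 2) ** 2

# (ii) Y ~ Po(20) approximated by Z ~ N(20, 20), with continuity correction.
p_ii = 1 - NormalDist(20, 20 ** 0.5).cdf(25.5)

print(round(p_i, 4), round(p_ii, 3))
```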

Answer to Exercise 530 (9233 N2006/I/4). We could have:
• All three identical: 1 possibility.
• Two identical: 5 possibilities.
• One identical: $\binom{5}{2} = 10$ possibilities.
• None identical: $\binom{5}{3} = 10$ possibilities.

So in total there are 1 + 5 + 10 + 10 = 26 possibilities.
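The case-by-case count can be confirmed by brute force. The sketch below assumes the setup implied by the cases above, namely a pool of three identical items plus five distinct ones from which three are selected; the labels "A" to "F" are hypothetical stand-ins for the actual items in the question.

```python
from itertools import combinations

# Assumed pool: three identical items ("A") and five distinct items.
pool = ["A", "A", "A", "B", "C", "D", "E", "F"]

# Store each selection of 3 as a sorted tuple so that repeats collapse.
selections = {tuple(sorted(c)) for c in combinations(pool, 3)}

# Group the distinct selections by how many identical items they contain.
by_count = {k: sum(1 for s in selections if s.count("A") == k) for k in range(4)}

print(by_count, len(selections))  # {0: 10, 1: 10, 2: 5, 3: 1} 26
```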


Answer to Exercise 531 (9233 N2006/II/23).

(i) P(A) = 1/3.

The sum of two scores is 9 if the dice are (3, 6), (4, 5), (5, 4), or (6, 3). So P(B) = 4/36 = 1/9.

P(A ∩ B) = 2/36 = 1/18 ≠ 1/27 = P(A)P(B), so A and B are not independent.

(ii) P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 1/3 + 1/9 − 1/18 = 7/18.
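The value of P(B) can be confirmed by enumerating all 36 outcomes of two fair dice. In this sketch, P(A) and P(A ∩ B) are simply taken from the answer above, since event A itself is not restated here.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes for two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Event B: the two scores sum to 9.
B = [o for o in outcomes if sum(o) == 9]
p_B = Fraction(len(B), len(outcomes))      # 1/9

# Taken from the answer above (event A is not re-derived here).
p_A = Fraction(1, 3)
p_AB = Fraction(2, 36)

independent = (p_A * p_B == p_AB)          # False, since 1/27 != 1/18
p_union = p_A + p_B - p_AB                 # 7/18

print(B, p_B, independent, p_union)
```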

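Answer to Exercise 532 (9233 N2006/II/25). (i) The null hypothesis is $H_0: \mu = \mu_0 = 10000$ and the alternative hypothesis is $H_A: \mu < 10000$. The test statistic is

$Z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{\sum(x - 10000)/n + 10000 - \mu_0}{\sqrt{\left\{\sum(x - 10000)^2 - \left[\sum(x - 10000)\right]^2/n\right\}/(n - 1)}\,/\sqrt{n}}$

$= \frac{-2510/80 + 10000 - 10000}{\sqrt{\left\{2010203 - (-2510)^2/80\right\}/79}\,/\sqrt{80}} \approx -1.795$.

Since $Z \approx -1.795 < -z_{0.05} = -1.645$, we can reject the null hypothesis at the 5% significance level.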

(ii) If H0 is true and we conduct the above test on infinitely-many size-80 samples, we’d (falsely) reject H0 for 5% of the samples.
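The test statistic in part (i) can be recomputed from the two summary sums. This is a minimal sketch; the variable names are illustrative, and the p-value line is an extra that the original answer does not report.

```python
from math import sqrt
from statistics import NormalDist

n, mu0 = 80, 10000
sum_d = -2510        # sum of (x - 10000) over the sample
sum_d2 = 2010203     # sum of (x - 10000)^2 over the sample

x_bar = sum_d / n + 10000
s2 = (sum_d2 - sum_d ** 2 / n) / (n - 1)   # unbiased sample variance
z = (x_bar - mu0) / sqrt(s2 / n)           # about -1.795

# Lower-tail critical value at the 5% level, about -1.645.
z_crit = NormalDist().inv_cdf(0.05)
reject = z < z_crit

# One-sided p-value (not reported in the original answer).
p_value = NormalDist().cdf(z)

print(round(z, 3), round(z_crit, 3), reject)
```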


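Answer to Exercise 534 (9233 N2006/II/28). (i) Let the speed of any car (in km h⁻¹) be $X \sim N(\mu, \sigma^2)$. We are given that $P(X > 125) = 1/80$ and $P(X < 40) = 1/10$.

$P(X > 125) = \frac{1}{80} \iff 1 - \Phi\left(\frac{125 - \mu}{\sigma}\right) = \frac{1}{80} \iff \Phi\left(\frac{125 - \mu}{\sigma}\right) = \frac{79}{80} \iff \frac{125 - \mu}{\sigma} \approx 2.240$. (1)

$P(X < 40) = \frac{1}{10} \iff \Phi\left(\frac{40 - \mu}{\sigma}\right) = \frac{1}{10} \iff \frac{40 - \mu}{\sigma} \approx -1.282$. (2)

(1) minus (2) yields $\frac{85}{\sigma} \approx 3.522 \iff \sigma \approx 24.1$ and $\mu \approx 70.9$.

(ii) $\binom{10}{0} 0.1^0\, 0.9^{10} + \binom{10}{1} 0.1^1\, 0.9^9 + \binom{10}{2} 0.1^2\, 0.9^8 + \binom{10}{3} 0.1^3\, 0.9^7 \approx 0.987$.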
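Parts (i) and (ii) can be checked numerically. The sketch below uses `NormalDist.inv_cdf` in place of table lookups, so the intermediate z-values differ slightly from the rounded ones above; the variable names are illustrative.

```python
from math import comb
from statistics import NormalDist

std = NormalDist()                  # standard normal distribution
z_hi = std.inv_cdf(1 - 1 / 80)      # (125 - mu) / sigma, about 2.24
z_lo = std.inv_cdf(1 / 10)          # (40 - mu) / sigma, about -1.282

# Subtracting the two equations eliminates mu: 85 / sigma = z_hi - z_lo.
sigma = (125 - 40) / (z_hi - z_lo)  # about 24.1
mu = 40 - z_lo * sigma              # about 70.9

# (ii) The binomial sum for B(10, 0.1), terms k = 0, 1, 2, 3.
p_ii = sum(comb(10, k) * 0.1 ** k * 0.9 ** (10 - k) for k in range(4))

print(round(sigma, 1), round(mu, 1), round(p_ii, 3))
```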

(iii) Let Y be the number of cars out of a random sample of 100 that are travelling at a speed less than 40 km h⁻¹. Then $Y \sim B(100, 0.1)$. Since np = 10 > 5 and n(1 − p) = 90 > 5 are both large, a suitable approximation to Y is the normal distribution $Z \sim N(10, 9)$. Hence, using also the continuity correction, $P(Y \le 8) \approx P(Z \le 8.5) = \Phi\left(\frac{8.5 - 10}{\sqrt{9}}\right) = \Phi(-0.5) = 1 - \Phi(0.5) \approx 1 - 0.6915 = 0.3085$.
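To see how close the normal approximation in part (iii) is, it can be compared with the exact binomial probability. This comparison is an addition for illustration and is not part of the original answer.

```python
from math import comb
from statistics import NormalDist

n, p = 100, 0.1

# Normal approximation N(np, np(1 - p)) = N(10, 9), with continuity correction.
approx = NormalDist(n * p, (n * p * (1 - p)) ** 0.5).cdf(8.5)   # about 0.3085

# Exact binomial probability P(Y <= 8), for comparison only.
exact = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(9))

print(round(approx, 4), round(exact, 4))
```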

(This is the last page of this textbook.)


I make educational YouTube videos too! Mostly on economics. Do me a favour by checking them out! I’m a newbie at this, so please feel free to leave me a comment if you have any feedback or suggestions. YouTube.com/EconCow

EconCow.com

Tuition Ad

I give tuition for any of the following subjects:
• Economics
• Mathematics
• Writing, English, General Paper

I have a PhD in economics (University of Michigan, 2015) and have been teaching and tutoring since 2010. For more information, please visit:

www.EconsPhDTutor.com Or simply email:

[email protected]