*Note: I am experimenting with exporting Jupyter notebooks into a WordPress ready format. This notebook refers specifically to to the Nature Conservancy Kaggle, for classifying fish species, based only on photographs.*

Skip to content
# Month: December 2016

### Jupyter Notebook

### Full Repository

### Univariate Gradient Descent

### Multivariate Gradient Descent

### tl;dr

### Humble Beginnings

**How the Game Works**

**State Representations**

### What Works

**Exploration Approaches**

**Learning Rates & Their Impact**

**Training**

**How to Use my Smart Bird**

### Learning From Scratch

**Citations**

**tl;dr**

**The Algorithms**

**Implementation Notes**

**Aging Algorithm Refresh Rate**

**Aging Algorithm – Page Faults Over Refresh Rate**

**Total Page Faults For SWIM.TRACE – Detail**

**Aging Algorithm – Disk Writes Over Refresh Rate**

**Results & Decisions**

**SWIM.TRACE – Page Faults Over Frame Size**

**SWIM.TRACE – Disk Writes Over Frame Size**

**GCC.TRACE – Page Faults Over Frame Size**

**GCC.TRACE – Disk Writes Over Frame Size**

**Decision:**

**Data Set**

better living through computation.

*Note: I am experimenting with exporting Jupyter notebooks into a WordPress ready format. This notebook refers specifically to to the Nature Conservancy Kaggle, for classifying fish species, based only on photographs.*

Sorry, something went wrong. Reload?

Sorry, we cannot display this file.

Sorry, this file is invalid so it cannot be displayed.

I’ve been messing around with a few things during my time off for the Holidays:

**Kaggle**- This one, for the Nature Conservatory, specifically

**Keras**- For using Convolution Neural Networks, without the headaches

**Jupyter Notebooks**- For presenting data analysis results in Python
- Also experimenting with whether I can port these to WordPress easily

So here’s a quick combination of these things, in the form of a simple guide to using **Keras** on the **Nature Conservatory** image recognition Kaggle.

Hopefully it serves as an easy introduction to get up and running with neural networks for this competition.

Sorry, something went wrong. Reload?

Sorry, we cannot display this file.

Sorry, this file is invalid so it cannot be displayed.

For awhile now, the Computer Science department at my University has offered a class for non-CS students called “**Data Witchcraft**“. The idea, I suppose, is that when you don’t understand how a technology works, it’s essentially “**magic**“. (And when it has to do with computers and *data*, it’s dark magic at that.)

But even as someone who writes programs all day long, there are often tools, algorithms, or ideas we use that we don’t really understand–just take at face value, because well, they seem to work, we’re busy, and learning doesn’t seem necessary when the problem is already sufficiently solved.

One of the more prevalent algorithms of this sort is **Gradient Descent (GD)**. The algorithm is both **conceptually simple** (everyone likes to show rudimentary sketches of a blindfolded stick figure walking down a mountain) and **mathematically rigorous **(next to those simple sketches, we show equations with partial derivatives across n-dimensional vectors mapped to an arbitrarily sized higher-dimensional space).

So most often, after learning about GD, you are sent off into the wild to use it, without ever having programmed it from scratch. From this view, Gradient Descent is a sort of incantation we Supreme Hacker Mage Lords can use to solve complex optimization problems whenever our data is in the right format, and we want a quick fix. (Kind of like Neural Networks and “Deep Learning”…) This is practical for most people who just need to ** get things done**. But it’s also unsatisfying for others (myself included).

However, GD can also be implemented in just a few lines of code (even though it won’t be as highly optimized as an industrial-strength version).

That’s why I’m sharing some implementations of both Univariate and generalized Multivariate Gradient Descent written in simple and annotated Python.

Anyone curious about a working implementation (and with some test data in hand) can try this out to experiment. The code snippets below have print statements built in so you can see how your model changes every iteration.

To download and run the full repo, clone it from here: ** https://github.com/adpoe/Gradient_Descent_From_Scratch **

But the actual algorithms are also extracted below, for ease of reading.

Requires NumPy.

Also Requires data to be in this this format: [(x1,y1), (x2,y2) … (xn,yn)], where Y is the actual value.

This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.

Learn more about bidirectional Unicode characters

def gradient_descent(training_examples, alpha=0.01): | |

""" | |

Apply gradient descent on the training examples to learn a line that fits through the examples | |

:param examples: set of all examples in (x,y) format | |

:param alpha = learning rate | |

:return: | |

""" | |

# initialize w0 and w1 to some small value, here just using 0 for simplicity | |

w0 = 0 | |

w1 = 0 | |

# repeat until "convergence", meaning that w0 and w1 aren't changing very much | |

# –> need to define what 'not very much' means, and that may depend on problem domain | |

convergence = False | |

while not convergence: | |

# initialize temporary variables, and set them to 0 | |

delta_w0 = 0 | |

delta_w1 = 0 | |

for pair in training_examples: | |

# grab our data points from the example | |

x_i = pair[0] | |

y_i = pair[1] | |

# calculate a prediction, and find the error | |

h_of_x_i = model_prediction(w0,w1,x_i) | |

delta_w0 += prediction_error(w0,w1, x_i, y_i) | |

delta_w1 += prediction_error(w0,w1,x_i,y_i)*x_i | |

# store previous weighting values | |

prev_w0 = w0 | |

prev_w1 = w1 | |

# get new weighting values | |

w0 = w0 + alpha*delta_w0 | |

w1 = w1 + alpha*delta_w1 | |

alpha -= 0.001 | |

# every few iterations print out current model | |

# 1. –> (w0 + w1x1 + w2x2 + … + wnxn) | |

print "Current model is: ("+str(w0)+" + "+str(w1)+"x1)" | |

# 2. –> averaged squared error over training set, using the current line | |

summed_error = sum_of_squared_error_over_entire_dataset(w0, w1, training_examples) | |

avg_error = summed_error/len(training_examples) | |

print "Average Squared Error="+str(avg_error) | |

# check if we have converged | |

if abs(prev_w0 – w0) < 0.00001 and abs(prev_w1 – w1) < 0.00001: | |

convergence = True | |

# after convergence, print out the parameters of the trained model (w0, … wn) | |

print "Parameters of trained model are: w0="+str(w0)+", w1="+str(w1) | |

return w0, w1 | |

############################ | |

##### TRAINING HELPERS ##### | |

############################ | |

def model_prediction(w0, w1, x_i): | |

return w0 + (w1 * x_i) | |

def prediction_error(w0, w1, x_i, y_i): | |

# basically, we just take the true value (y_i) | |

# and we subtract the predicted value from it | |

# this gives us an error, or J(w0,w1) value | |

return y_i – model_prediction(w0, w1, x_i) | |

def sum_of_squared_error_over_entire_dataset(w0, w1, training_examples): | |

# find the squared error over the whole training set | |

sum = 0 | |

for pair in training_examples: | |

x_i = pair[0] | |

y_i = pair[1] | |

sum += prediction_error(w0,w1,x_i,y_i) ** 2 | |

return sum |

Requires NumPy, same as above.

Also Requires data to be in this this format: [(x1,..xn, y),(x1,..xn, y) …(x1,..xn, y)], where Y is the actual value. Essentially, you can have as many x-variables as you’d like, as long as the y-value is the last element of each tuple.

This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.

Learn more about bidirectional Unicode characters

def multivariate_gradient_descent(training_examples, alpha=0.01): | |

""" | |

Apply gradient descent on the training examples to learn a line that fits through the examples | |

:param examples: set of all examples in (x,y) format | |

:param alpha = learning rate | |

:return: | |

""" | |

# initialize the weight and x_vectors | |

W = [0 for index in range(0, len(training_examples[0][0]))] | |

# W_0 is a constant | |

W_0 = 0 | |

# repeat until "convergence", meaning that w0 and w1 aren't changing very much | |

# –> need to define what 'not very much' means, and that may depend on problem domain | |

convergence = False | |

while not convergence: | |

# initialize temporary variables, and set them to 0 | |

deltaW_0 = 0 | |

deltaW_n = [0 for x in range(0,len(training_examples[0][0]))] | |

for pair in training_examples: | |

# grab our data points from the example | |

x_i = pair[0] | |

y_i = pair[1] | |

# calculate a prediction, and find the error | |

# needs to be an element-wise plus | |

deltaW_0 += multivariate_prediction_error(W_0, y_i, W, x_i) | |

deltaW_n = numpy.multiply(numpy.add(deltaW_n, multivariate_prediction_error(W_0, y_i, W, x_i)), x_i) | |

#print "DELTA_WN = " + str(deltaW_n) | |

# store previous weighting values | |

prev_w0 = W_0 | |

prev_Wn = W | |

# get new weighting values | |

W_0 = W_0 + alpha*deltaW_0 | |

W = numpy.add(W,numpy.multiply(alpha,deltaW_n)) | |

alpha -= 0.001 | |

# every few iterations print out current model | |

# 1. –> (w0 + w1x1 + w2x2 + … + wnxn) | |

variables = [( str(W[i]) + "*x" + str(i+1) + " + ") for i in range(0,len(W))] | |

var_string = ''.join(variables) | |

var_string = var_string[:–3] | |

print "Current model is: " + str(W_0)+" + "+var_string | |

# 2. –> averaged squared error over training set, using the current line | |

summed_error = sum_of_squared_error_over_entire_dataset(W_0, W, training_examples) | |

avg_error = summed_error/len(training_examples) | |

print "Average Squared Error="+str(sum(avg_error)) | |

print "" | |

# check if we have converged | |

if abs(prev_w0 – W_0) < 0.00001 and abs(numpy.subtract(prev_Wn, W)).all() < 0.00001: | |

convergence = True | |

# after convergence, print out the parameters of the trained model (w0, … wn) | |

variables = [( "w"+str(i+1)+"="+str(W[i])+", ") for i in range(0,len(W))] | |

var_string = ''.join(variables) | |

var_string = var_string[:–2] | |

print "RESULTS: " | |

print "\tParameters of trained model are: w0="+str(W_0)+", "+var_string | |

return W_0, W | |

################################ | |

##### MULTIVARIATE HELPERS ##### | |

################################ | |

# generalize these to just take a w0, a vector of weights, and a vector x-values | |

def multivariate_model_prediction(w0, weights, xs): | |

return w0 + numpy.dot(weights, xs) | |

# again, this needs to take just a w0, vector of weights, and a vector of x-values | |

def multivariate_prediction_error(w0, y_i, weights, xs): | |

# basically, we just take the true value (y_i) | |

# and we subtract the predicted value from it | |

# this gives us an error, or J(w0,w1) value | |

return y_i – multivariate_model_prediction(w0, weights, xs) | |

# should be the same, but use the generalize functions above, and update the weights inside the vector titself | |

# also need to have a vector fo delta_Wn values to simplify | |

def multivariate_sum_of_squared_error_over_entire_dataset(w0, weights, training_examples): | |

# find the squared error over the whole training set | |

sum = 0 | |

for pair in training_examples: | |

x_i = pair[0] | |

y_i = pair[1] | |

# cast back to values in range [1 –> 20] | |

prediction = multivariate_model_prediction(w0,weights,x_i) / (1/20.0) | |

actual = y_i / (1/20.0) | |

error = abs(actual – prediction) | |

error_sq = error ** 2 | |

sum += error_sq | |

return sum |

My data set is included in the full repo. But feel free to try it on your own, if you’re experimenting with this. And enjoy.

In which I peel back the curtain and outline the innerworkings of a particularly insidious artificial intelligence, whose sole purpose in life is to systematically learn the optimal strategy for a terrifyingly addictive video game, known only to the internet as: **Flappy Bird**… and in which I also provide code to program a similar AI of your own.

More pointedly, this short post outlines a practical way to get started using a** Reinforcement Learning** technique called **Q-Learning**, as applied to a **Python** **Flappy Bird clone**, programmed by @TimoWilken.

**>> Grab the code base: https://github.com/adpoe/Flappy-AI <<**

So you want to beat Flappy Bird, but after awhile it gets tedious. I agree. Instead, why don’t we program an AI to do it for us? A genius plan, but where do we start?

First, we need a **Flappy Bird** game to ~~hack~~ upate. The candidate that I suggest is a Python implementation created by Timo Wilken and available for download directly at: https://github.com/TimoWilken/flappy-bird-pygame. This Flappy Bird version is implemented using the **PyGame **library, which is a dependency going forward.

Here are instructions for PyGame installation. If you get this runing, the hard work is done. (apt-get or homebrew are highly recommended.)

The first challenge we’ll have in implementing the framework for a Flappy AI is determining exactly how the game workes in its original state.

By using the debugger and stepping through the game’s code during some trial runs, I was able to figure out where key decisions where made, how data flowed into the game, and exactly where I would need to position my AI-agent.

At its basic level, I created an “Agent” class, and passed that class into the running game code. Then, at each loop of the game, I examined the variables available to me, and then passed a ‘MOUSEBUTTONUP’ command to the PyGame event queue whenever the AI decided to jump. Otherwise, I did nothing.

From there, the next step was determining a way to model the problem. I decided to use follow the basic guidelines outlined by Sarvagya Vaish, here.

First, I discretized the space in which the bird sat, relative to the next pipe. I was able to get pipe data by accessing the **pipe object** in the original game code. Similarly, I was able to get bird data by accessing the **bird object**.

From there, I could determine the location of the **bird** and the** pipes** __relative to each other__. I discretized this space as a 25×25 grid, with the following parameters:

This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.

Learn more about bidirectional Unicode characters

# first value in state tuple | |

height_category = 0 | |

dist_to_pipe_bottom = pipe_bottom – bird.y | |

if dist_to_pipe_bottom < 8: # very close | |

height_category = 0 | |

elif dist_to_pipe_bottom < 20: # close | |

height_category = 1 | |

elif dist_to_pipe_bottom < 125: #mid | |

height_category = 2 | |

elif dist_to_pipe_bottom < 250: # far | |

height_category = 3 | |

else: | |

height_category = 4 | |

# second value in state tuple | |

dist_category = 0 | |

dist_to_pipe_horz = pp.x – bird.x | |

if dist_to_pipe_horz < 8: # very close | |

dist_category = 0 | |

elif dist_to_pipe_horz < 20: # close | |

dist_category = 1 | |

elif dist_to_pipe_horz < 125: # mid | |

dist_category = 2 | |

elif dist_to_pipe_horz < 250: # far | |

dist_category = 3 | |

else: | |

dist_category = 4 |

Using this methodology, I created a **state tuple **that looked like this:

(**height_category={0,1,2,3,4}, dist_category={0,1,2,3,4} , collision=True/False)**

Then, each iteration of the game loop, I was able to determine the bird’s relative position, and whether it had made a collision with the pipes or not.

**If there was no collision, I issued a reward of +1.**

**If there was a collision, I issued a reward of -1000.**

I tried many different state representations here, but mostly it was matter of determining an optimal number of grid spaces and the right parameters for those spaces.

Initially, I started with a 9×9 grid, but moved to 16×16 because I got to a point in 9×9 where I just couldn’t make any more learning progress.

Very generally, we want to have a ** tighter grid around the pipes**, as this is where most collisions happen. And we want a

Our next task is implementing an exploration approach. This is necessary because if we don’t randomly explore the state sometimes, there might be optimal strategies that we are never able to find, simply because we will never be in those states!

Because we have only two choices at any given state (JUMP—or—STAY), implementing exploration was relatively simple.

I started out with a high **exploration factor** (I used **1/time_value+1**), and then I generated a random number between [0,1). __If the random number was less than the exploration factor__, then I explored.

Over time the exploration factor got lower, and therefore the AI explored less frequently.

Exploration essentially consisted of flipping a fair coin (generating a Boolean value randomly).

**If true:**then I chose to JUMP.**If false:**I chose to STAY.

The main problem I encountered with this method is that the exploration factor was very at the beginning, and sometimes choices were made that were not representative of actual situations that the bird would encounter in ‘true’ gameplay.

BUT, because these decisions were made earlier, they were weighted more heavily in the overall Q-Learning algorithm.

This isn’t ideal, but exploration is necessary, and overall the algorithm works well. So it wasn’t a large problem, overall.

Very simply, “Learning Rates” dicatate how much we weigh new information about some state over old information. Learning rates can be an value in the range [0,1]. With 0 meaning we never update values (bad), and 1 meaning that we only EVER care about what happened the last time we were in state (short-sighted).

The first learning rate I tried was **alpha=(1/time+1).** However, this gave very poor results in practice.

This is because time is NOT the most important factor in determining a strategy from any given state. Rather, it is how many times we’ve been to that state.

The problem is that we make extremely poor choices at the beginning of the game (because we simply don’t know any better). But with** alpha=(1/time+1)**, the results of these these poor choices are weighted the most highly.

Once I changed the learning factor to **alpha=1/N(s,a),** I immediately saw * dramatically better* results. (That is, where N(s,a) tracks how many times we’ve been in a given state and performed the same action.)

My final, __“Smart” bird __is the result of about **4 hours of training**.

I don’t actually think there would be a way to make the training more efficient, aside from speeding up the gameplay in some way.

Overall, I the results I received from the investment of time I put it in reasonable.

__Given more time__, I would probably ** discretize the space even more finely** (maybe a 36×36 grid) – so that I could find even more optimal strategies from a more fine-tuned set of positions in the game-space.

To use my smart bird, simply take the following steps:

- cd into a directory containing my source code
- Ensure that this directory includes the file named ‘
**txt**’ - Run the command:
**python flappybird.py “qdata.txt”**

- Watch Flappy crush it. (the game will run 10x)

Probably more instructive than using my trained bird though, is to simply start training a new bird from scratch. You will see the agony and the ecstasy as he does a terrible number of dumb things, slowly learning how to beat the game.

It’s surprisingly enjoyable (though sometimes frustrating) and highly recommended. Start the process by running:

**python flappybird.py**

With no other args. Then pass in the ‘qdata.txt’ file next time you run the game, to keep your learning session going.

I consulted the following resources to implement my AI. If you want to do similar work, I’d recommend these resources. These people are much smarter than me. I’m just applying their concepts.

**Implementation ideas**based on discussion here: http://sarvagyavaish.github.io/FlappyBirdRL/**For understanding what I was doing better:**http://mnemstudio.org/path-finding-q-learning-tutorial.htm

While studying Operating Systems, I decided to experiment with 4 of the most common Page Replacement algorithms, via simulation, using Python. Input data for my simulations were two memory traces named “swim.trace” and “gcc.trace” (provided by my CS department). All algorithms are run within a page table implemented for a 32-bit address space; all pages in this page table are 4kb in size.

I chose to implement these algorithms and the page table in Python. **The final source code can be found here, on Github**. The full data set that I collected can be found at the end of this document. Illustrative graphs are interspersed throughout, wherever they are necessary for explaining and documenting decisions.

In this essay itself, I will first outline the algorithms implemented, noting design decisions I made during my implementations. I will then compare and contrast the results of each algorithm when run on the two provided memory traces. Finally, I will conclude with my decision about which algorithm would be best to run in a real Operating System.

Please note, to run the algorithms themselves, you must run:

- python vmsim.py –n -a <opt|clock|aging|lru> [-r ]

** **

The algorithms we have been asked to implement are:

**Opt**– Simulate what the optimal page replacement algorithm would choose if it had perfect knowledge**EXAMPLE RUN**: python vmsim.py –n 8 –a opt gcc.trace

**Clock**– Use the better implementation of the second-chance algorithm**EXAMPLE RUN**: python vmsim.py –n 16 –a clock swim.trace

**Aging**– Implement the aging algorithm that approximates LRU with an 8-bit counter**EXAMPLE RUN**: python vmsim.py –n 32 –a aging –r 1 gcc.trace

**LRU**– Do exact LRU.**EXAMPLE RUN**: python vmsim.py –n 64 –a lru swim.trace

The main entry point for the program is the file named **vmsim.py**. This is the file which must be invoked from the command line to run the algorithms. Because this is a python file, it must be called with “python vmsim.py … etc.”, instead of “./vmsim”, as a C program would. __Please make sure the selected trace file is in the same directory as vmsim.py__.

All algorithms run within a page table, the implementation of which can be found in the file named **pageTable.py**.

Upon program start, the trace file is parsed by the class in the file **parseInput.py**, and each memory lookup is stored in a list of tuples, in this format: [(MEM_ADDRESS_00, R/W), [(MEM_ADDRESS_01, R/W), … [(MEM_ADDRESS_N, R/W)]. This list is then passed to whichever algorithm is invoked, so it can run the chosen algorithm on the items in the list, element by element.

The **OPT algorithm **can be found in the file **opt.py**. My implementation of OPT works by first preprocessing all of the memory addresses in our Trace, and creating a **HashTable **where the * key is our VPN*, and the

The **Clock Algorithm** is implemented in the files **clock.py** and **circularQueue.py**. My implementation uses the second chance algorithm with a Circular Queue. Of importance: whenever we need to make an eviction, but fail to find ANY pages that are clean, we then run a ‘swap daemon’ which writes out ALL dirty pages to disk at that time. This helps me get fewer page faults, at the expense of more disk writes. That’s a calculated decision for this particular algorithm, since the project description says we should use page faults as our judgment criterion for each algorithm’s effectiveness.

The **LRU Algorithm** can be found in the file **lru.py**. For LRU, each time a page is Read, I mark the memory address number at which this happens in the frame itself. Then, in the future—whenever I need to make an eviction—I have easy access to see which frame was used the longest time in the past, and no difficult calculations are needed. This was the simplest algorithm to implement.

The **Aging Algorithm **is likely the most complex, and its source code can be found in the file named **aging.py**. Aging works by keeping an 8 bit counter and marking whether each page in the page table was used during the last ‘tick’ a time period of evaluation which must be passed in by the user as a ‘refresh rate’, whenever the Aging Algorithm is selected. All refresh rates are in milliseconds on my system, but this relies on the implementation of Python’s “time” module, so it’s possible that this could vary on other systems. **For aging, I suggest a refresh rate of 0.01 milliseconds**, passed in on the command line as “-r 0.01”, in the 2^{nd} to last position in the arg list. This minimizes the values in my testing, and going lower does not positively affect anything. In the next section, I will show my rationale for selecting 0.01 as my refresh rate.

In order to find a refresh rate that would work well, I decided to start at 1ms and move 5 orders of magnitude in each direction, from 0.00001ms to 100000ms.

To ensure that the results were not biased toward being optimized for a single trace, I tried both to confirm that the refresh rate would work will for all inputs.

The graphs below show the Page Faults and Disk Writes I found during each test.

** For all tests, I chose a frame size of 8, since this small frame size is most sensitive to the algorithm used. **At higher frame sizes, all of the algorithms tend to perform better, across the board. So I wanted to focus on testing at the smallest possible size, preparing for a ‘worst case’ scenario.

**X-axis: ** Refresh rate in milliseconds

**Y-axis:** Total Page faults

GCC. TRACE reaches its minimum for page faults at 0.0001ms and SWIM.TRACE reaches its own minimum at 0.01ms. The lines cross at around 0.01ms.

** **

**X-axis:** Page Faults

**Y-Axis:** Refresh Rates

**X-axis:** Page Faults

**Y-Axis:** Refresh Rates

Additionally, 0.01ms seems to achieve the best balance, in my opinion, if we need to select a single time for BOTH algorithms.

** **** **

**X-axis:** Refresh rate in milliseconds

**Y-axis: ** Total Disk Writes

The number of disk writes also bottoms out at 0.01ms from SWIM.TRACE. It is relatively constant for GCC.TRACE across all of the different timing options.

**Because of this, I suggest 0.01ms as the ideal refresh rate**. This is because it is optimal for SWIM.TRACE. For GCC.TRACE, it is not the absolute best option, but it is still acceptable, and so I think this selection will achieve a good balance.

With the algorithms all implemented, my next step was to collect data for each algorithm at all frame sizes, 8, 16, 32, and 64. OPT always performed best, and thus it was used as our baseline.

In the graphs below, I show how each algorithm performed, both in terms of total page faults and total disk writes.

**X-axis: ** Frame Size

**Y-axis: ** Page Faults

Data for all algorithms processing swim.trace

**X-axis:** Frame Size

**Y-axis: ** Disk Writes

Data for all algorithms processing swim.trace

**X-axis:** Frame Size

**Y-axis:** Page Faults

Data for all algorithms processing gcc.trace

**X-axis:** Frame Size

**Y-axis:** Disk Writes

Data for all algorithms processing gcc.trace

—

Given this data, I was next tasked with choosing which algorithm is most appropriate for an actual operating system.

In order to determine which algorithm would be best, I decided to use an algorithm. I’ll call it the ‘Decision Matrix’, and here are the steps.

**DECISION MATRIX: **

**RANK ALGORITHMS FOR EACH CATEGORY FROM 1=BEST to 4=WORST,****SELECT ALGORITHM WITH LOWEST OVERALL SCORE**

ALGORITHM |
SWIM – Page Faults |
SWIM – Disk Writes |
GCC – Page Faults |
GCC – Disk Writes |
TOTAL (Lowest is Best) |

OPT | 1 | 1 | 1 | 1 | 4 |

CLOCK | 2 | 4 | 2 | 4 | 12 |

AGING | 4 | 2 | 4 | 2 | 12 |

LRU | 3 | 3 | 3 | 3 | 12 |

Ranking: 1=Best;

** 4=Worst**

*NOTE:*

*Wherever lines cross each other in our graphs, the algorithm ranked as “better” is the one with MORE total low data points. Ties were broken by my own personal judgment.*

- Of course, using this set of judgment criteria, OPT is by far the winner.
- However OPT isn’t an option in a real OS, since it requires perfect knowledge of the future, which is impossible in practice.
*So we*.**want to pick between the 3-way tie**for CLOCK and LRU and AGING- Of these three:
- AGING does well on disk writes, but generally has the most page faults.
- Conversely, CLOCK does well on page faults, but often has the most disk writes.
- LRU achieves a balance.

__Therefore, My pick goes to LRU__, because where clock does beat LRU, on page faults it does so only by a narrow margin.- But where LRU beats clock—on disk writes—it does so by a large amount. This means LRU is better overall, if both page faults and disk writes are equally weighted for judgment purposes.

- Of these three:

**Therefore, I would select LRU for my own operating system.**

**Figure 1.1 – Full Data Set**

ALGORITHM |
NUMBER OF FRAMES |
TOTAL MEMORY ACCESSES |
TOTAL PAGE FAULTS |
TOTAL WRITES TO DISK |
TRACE |
REFRESH RATE |

OPT | 8 | 1000000 | 236350 | 51162 | swim.trace | N/A |

OPT | 16 | 1000000 | 127252 | 27503 | swim.trace | N/A |

OPT | 32 | 1000000 | 52176 | 11706 | swim.trace | N/A |

OPT | 64 | 1000000 | 24344 | 6316 | swim.trace | N/A |

OPT | 8 | 1000000 | 169669 | 29609 | gcc.trace | N/A |

OPT | 16 | 1000000 | 118226 | 20257 | gcc.trace | N/A |

OPT | 32 | 1000000 | 83827 | 14159 | gcc.trace | N/A |

OPT | 64 | 1000000 | 58468 | 9916 | gcc.trace | N/A |

CLOCK | 8 | 1000000 | 265691 | 55664 | swim.trace | N/A |

CLOCK | 16 | 1000000 | 136154 | 52104 | swim.trace | N/A |

CLOCK | 32 | 1000000 | 73924 | 45872 | swim.trace | N/A |

CLOCK | 64 | 1000000 | 56974 | 43965 | swim.trace | N/A |

CLOCK | 8 | 1000000 | 178111 | 38992 | gcc.trace | N/A |

CLOCK | 16 | 1000000 | 122579 | 26633 | gcc.trace | N/A |

CLOCK | 32 | 1000000 | 88457 | 20193 | gcc.trace | N/A |

CLOCK | 64 | 1000000 | 61832 | 15840 | gcc.trace | N/A |

AGING | 8 | 1000000 | 257952 | 52664 | swim.trace | 0.01ms |

AGING | 16 | 1000000 | 143989 | 41902 | swim.trace | 0.01ms |

AGING | 32 | 1000000 | 91852 | 29993 | swim.trace | 0.01ms |

AGING | 64 | 1000000 | 82288 | 27601 | swim.trace | 0.01ms |

AGING | 8 | 1000000 | 244951 | 31227 | gcc.trace | 0.01ms |

AGING | 16 | 1000000 | 187385 | 22721 | gcc.trace | 0.01ms |

AGING | 32 | 1000000 | 161117 | 19519 | gcc.trace | 0.01ms |

AGING | 64 | 1000000 | 149414 | 16800 | gcc.trace | 0.01ms |

LRU | 8 | 1000000 | 274323 | 55138 | swim.trace | N/A |

LRU | 16 | 1000000 | 143477 | 47598 | swim.trace | N/A |

LRU | 32 | 1000000 | 75235 | 43950 | swim.trace | N/A |

LRU | 64 | 1000000 | 57180 | 43026 | swim.trace | N/A |

LRU | 8 | 1000000 | 181950 | 37239 | gcc.trace | N/A |

LRU | 16 | 1000000 | 124267 | 23639 | gcc.trace | N/A |

LRU | 32 | 1000000 | 88992 | 17107 | gcc.trace | N/A |

LRU | 64 | 1000000 | 63443 | 13702 | gcc.trace | N/A |

**Figure 1.2 – GCC.TRACE – Refresh Rate Testing – 8 Frames**

ALGORITHM |
NUMBER OF FRAMES |
TOTAL MEMORY ACCESSES |
TOTAL PAGE FAULTS |
TOTAL WRITES TO DISK |
TRACE |
REFRESH RATE |

AGING | 8 | 1000000 | 192916 | 33295 | gcc.trace | 0.00001ms |

AGING | 8 | 1000000 | 192238 | 33176 | gcc.trace | 0.0001ms |

AGING | 8 | 1000000 | 197848 | 31375 | gcc.trace | 0.001ms |

AGING | 8 | 1000000 | 244951 | 31227 | gcc.trace | 0.01ms |

AGING | 8 | 1000000 | 339636 | 140763 | gcc.trace | 0.1ms |

**Figure 1.3 –SWIM.TRACE – Refresh Rate Testing – 8 Frames**

ALGORITHM |
NUMBER OF FRAMES |
TOTAL MEMORY ACCESSES |
TOTAL PAGE FAULTS |
TOTAL WRITES TO DISK |
TRACE |
REFRESH RATE |

AGING | 8 | 1000000 | 275329 | 53883 | swim.trace | 0.00001ms |

AGING | 8 | 1000000 | 274915 | 53882 | swim.trace | 0.0001ms |

AGING | 8 | 1000000 | 268775 | 53540 | swim.trace | 0.001ms |

AGING | 8 | 1000000 | 257952 | 52664 | swim.trace | 0.01ms |

AGING | 8 | 1000000 | 278471 | 56527 | swim.trace | 0.1ms |

You must be logged in to post a comment.