A repository for posting assignments, code, and informal responses for Professor Frazier's DATA 310 course, Applied Machine Learning.
Using the script you produced with the Higgs Dataset, answer the following questions.
Describe the dataset. What type of variable is the target? How many features are being used? How many observations are in the training dataset? How many are used in the validation set? The Higgs Dataset comprises millions of observations (specifically, 11 million) of simulated particle-collision events of the kind studied at the Large Hadron Collider. The target is a binary categorical variable distinguishing a signal process from a background process. There are 28 relevant features used in the machine learning practice problem, which relies on a training set of 10,000 observations and a validation (testing) set of 1,000 observations to construct a predictive model.
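For reference, here is a minimal sketch of how the loading and splitting step might look, loosely following the TensorFlow "Overfit and underfit" tutorial that this exercise is based on. The download URL, batch size, and split sizes are assumptions modeled on that tutorial rather than the exact course script:

```python
import tensorflow as tf

FEATURES = 28
N_TRAIN = 10_000
N_VALIDATION = 1_000

# Download the gzipped CSV (URL assumed from the UCI mirror used by the tutorial).
gz = tf.keras.utils.get_file(
    'HIGGS.csv.gz',
    'http://mlphysics.ics.uci.edu/data/higgs/HIGGS.csv.gz')

# Column 0 is the binary target; columns 1-28 are the features.
ds = tf.data.experimental.CsvDataset(
    gz, [float()] * (FEATURES + 1), compression_type='GZIP')

def pack_row(*row):
    # Split each row into (features, label).
    label = row[0]
    features = tf.stack(row[1:], 1)
    return features, label

# Batch before mapping so tf.stack packs whole blocks of rows at once.
packed_ds = ds.batch(10_000).map(pack_row).unbatch()

# Carve off the validation set first, then the training set.
validate_ds = packed_ds.take(N_VALIDATION).cache()
train_ds = packed_ds.skip(N_VALIDATION).take(N_TRAIN).cache()
```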
How did each of the four models perform (tiny, small, medium and large)? Which of the four models performed the best? Which ones performed the worst? Why in your estimation did certain models perform better? Produce a plot that illustrates and compares all four models. Of the four model sizes, the tiny model performed the best: it showed the smallest gap between the training and validation losses and the least stagnation in the validation metric. Meanwhile, the medium and large models performed worst, and almost equally badly (it is difficult to tell which is worse, as the bounds of the graph cut off the training metric lines). In my estimation, the smaller models performed better because they have fewer "learnable parameters," i.e., less capacity, and so were less able to simply memorize the relatively small training set and overfit it.
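A hedged sketch of how the four sizes could be defined and compared on one plot follows. The layer widths are assumptions modeled on the tutorial's tiny/small/medium/large models, and `train_ds`/`validate_ds` are the datasets from the sketch above:

```python
import matplotlib.pyplot as plt
import tensorflow as tf

# Assumed layer widths for the four capacities (modeled on the tutorial).
SIZES = {
    'tiny':   [16],
    'small':  [16, 16],
    'medium': [64, 64, 64],
    'large':  [512, 512, 512, 512],
}

def build(widths):
    # Stack of Dense layers ending in a single logit for the binary target.
    layers = [tf.keras.layers.Dense(w, activation='elu') for w in widths]
    layers.append(tf.keras.layers.Dense(1))
    return tf.keras.Sequential(layers)

histories = {}
for name, widths in SIZES.items():
    model = build(widths)
    model.compile(
        optimizer='adam',
        loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
        metrics=[tf.keras.metrics.BinaryCrossentropy(
            from_logits=True, name='binary_crossentropy')])
    histories[name] = model.fit(
        train_ds.batch(500),
        validation_data=validate_ds.batch(500),
        epochs=100, verbose=0)

# One plot comparing validation cross-entropy across all four models.
for name, hist in histories.items():
    plt.plot(hist.history['val_binary_crossentropy'], label=name)
plt.xlabel('Epoch')
plt.ylabel('Validation binary cross-entropy')
plt.legend()
plt.show()
```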
Complete the script "Load and preprocess images" and then also watch the "Convolutions and pooling" video by Laurence Moroney of Google Developers. After watching the video, answer the following questions.
The first filter I applied was the "default" filter, which transformed the image through a series of convolutions (i.e., it iterated over the image, "leaving a 1 pixel margin," and multiplied out "each of the neighbors of the current pixel by the value defined in the filter"). The end result emphasized the horizontal lines in the image, with each output value clamped to the valid grayscale range:
filter = [ [-1, -2, -1], [0, 0, 0], [1, 2, 1] ]
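The convolution loop itself looks roughly like the one described in the video; the only assumption here is using SciPy's built-in "ascent" test image as the input. The same loop is reused for the later filters by swapping in their values:

```python
import numpy as np
from scipy import datasets  # older SciPy versions expose this as scipy.misc.ascent()

image = datasets.ascent()        # 512x512 grayscale test image
out = image.astype(float)        # output buffer, same shape as the input

filter = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
weight = 1  # rescaling factor, useful if the filter's entries don't sum to 0 or 1

# Iterate over the image, leaving a 1-pixel margin, and multiply each of the
# current pixel's neighbors by the corresponding value in the filter.
for x in range(1, image.shape[0] - 1):
    for y in range(1, image.shape[1] - 1):
        acc = 0.0
        for i in range(3):
            for j in range(3):
                acc += image[x + i - 1, y + j - 1] * filter[i][j]
        out[x, y] = min(max(acc * weight, 0), 255)  # clamp to grayscale range
```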
The second filter I applied emphasized the vertical lines in the image:
filter = [ [-1, 0, 1], [-2, 0, 2], [-1, 0, 1] ]
For the last filter I applied, I wanted to try something that might change the image a little more drastically. It ended up highlighting the contrast between all of the lines present in the original image:
filter = [ [-10, 0, 10], [-25, 0, 25], [-10, 0, 10] ]
In applying a filter to the original array associated with the image, the code is essentially weighting each pixel's neighborhood so that only certain patterns of values pass through strongly into the output; those weighted sums go on to represent the pixels around them in the end result (e.g., the final plot or image). Convolving filters are extremely useful in computer vision because they allow for the detection of distinctive patterns, shapes, or edges that the naked human eye may not be able to pick up on. Additionally, they can help humans learn more about images they are already familiar with, or serve as the feature-extraction step when classifying different images as part of a model.
Applying a pooling filter reduces the overall amount of information in an image while retaining its most important features. The logic behind pooling in this exercise is to keep the largest values: the specific method used is "max pooling," in which each 2x2 block of pixels in the image is considered and the largest of the four values is selected to represent the whole block, thereby halving the image's width and height (see the short sketch after the matrix below). Like convolution filters, pooling is useful in computer vision because it preserves an image's distinctive features while simplifying it to a more basic representation of what it is. Additionally, pooling makes the image smaller for all subsequent processing.
[0 0 0 3 0 0 0]
[0 0 0 3 0 0 0]
[1 1 1 3 1 1 1]
[1 1 1 3 1 1 1]
[1 1 1 3 1 1 1]
[0 0 0 3 0 0 0]
[0 0 0 3 0 0 0]
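To make the max-pooling step described above concrete, here is a small NumPy sketch. This is my own illustration rather than the exercise's code, and `max_pool_2x2` is a hypothetical helper name:

```python
import numpy as np

def max_pool_2x2(img):
    """Reduce each non-overlapping 2x2 block to its largest value,
    halving the image's height and width."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2  # trim odd edges
    blocks = img[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

img = np.array([[1, 3, 2, 4],
                [5, 7, 6, 8],
                [0, 2, 1, 3],
                [4, 6, 5, 7]])
print(max_pool_2x2(img))
# [[7 8]
#  [6 7]]
```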