
No Black Box Machine Learning Course – Learn Without Libraries

freeCodeCamp.org・9 minutes read

The course teaches machine learning in JavaScript without libraries. It starts from software engineering, building a drawing app as a data collection tool, and then moves into machine learning: processing the data, extracting and visualizing features, and implementing a K nearest neighbors classifier. Along the way it emphasizes data scaling techniques and the evaluation of classifier accuracy.

Insights

Course teaches machine learning in JavaScript without libraries, focusing on software development skills initially and transitioning to machine learning concepts.
Data collection, processing, and visualization are emphasized, with features like drawing app creation, feature extraction, and classifier implementation covered.
Implementation of advanced machine learning concepts like K nearest neighbors classifier and decision boundary computation for accurate data classification.
A closing review lesson repeats the K Nearest Neighbors (KNN) implementation and accuracy assessment in Python, using libraries to complete the work more quickly.


Recent questions

What does the course teach?
Machine learning in JavaScript without libraries.
Who is the instructor?
Radu, a Computer Science PhD and university lecturer.
What is emphasized in data collection?
Importance of processing and visualizing collected data.
What is the goal of the course?
Enhancing software development skills.
How is data saved locally?
Using a function named "save" to save files.

Summary

00:00

JavaScript Machine Learning Course by Radu

Course teaches machine learning in JavaScript without libraries, enhancing software development skills.
Taught by Radu, a Computer Science PhD and university lecturer.
Covers building a drawing app, working with data, feature extraction, and classifiers.
Focuses on software engineering initially, transitioning to machine learning.
Teaches data collection tool creation using a drawing app for desktop and mobile.
Shows how to process and visualize the collected data, and stresses why both steps matter.
Covers feature extraction and visualization, building a chart component from scratch.
Introduces nearest neighbor classifier implementation and data scaling importance.
Implements advanced K nearest neighbors classifier and decision boundary computation.
Offers a review lesson in Python, using libraries for quicker completion.

19:16

Drawing utility for touch devices with undo.

A draw utility object named "draw" is created in a separate file called draw.js.
The object includes a path function with parameters for the context, the path to draw, and a color that defaults to black.
The path function sets the stroke style to that color and draws the line with a width of three.
The moveTo call spreads each point array into its x and y components.
Multiple paths are drawn by iterating through all paths and drawing each with specified properties.
To optimize for mobile devices, a viewport meta tag is added to adjust the width and prevent zooming.
Event listeners for touch events are added to ensure drawing functionality on touch devices.
An undo button is implemented to remove the last drawn path.
Data collection functionality is added to record the student's name, session ID, and drawings of various objects.
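
The drawing utility described in this section could look roughly like the sketch below. It assumes a standard 2D canvas context and paths stored as arrays of [x, y] points. The round line caps, the draw.paths helper, and the undo function are illustrative assumptions, not details confirmed by the summary.

```javascript
// Sketch of a draw utility; each path is an array of [x, y] points.
const draw = {};

draw.path = (ctx, path, color = "black") => {
  if (path.length === 0) return;
  ctx.strokeStyle = color;
  ctx.lineWidth = 3;
  ctx.beginPath();
  ctx.moveTo(...path[0]); // spread [x, y] into two arguments
  for (let i = 1; i < path.length; i++) {
    ctx.lineTo(...path[i]);
  }
  ctx.lineCap = "round"; // assumption: rounded strokes look smoother
  ctx.lineJoin = "round";
  ctx.stroke();
};

// Draw every path with the same properties.
draw.paths = (ctx, paths, color = "black") => {
  for (const path of paths) {
    draw.path(ctx, path, color);
  }
};

// Hypothetical undo helper: removes the most recently drawn path.
function undo(paths) {
  paths.pop();
  return paths;
}
```

In a browser, ctx would come from canvas.getContext("2d"), and the canvas would be cleared and redrawn with draw.paths after each undo.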

41:21

"Save Locally Drawn Paths with Timestamped File"

The button "thank you" will change to "save" to save locally drawn paths.
The new function "save" will allow saving files on the local computer.
Instructions will guide on downloading and placing the file in the dataset.
To create and download the file, an anchor (a) element with an href attribute will be used.
Data will be converted to a JSON string for saving.
JSON is a standard format for storing data, readable and supported by many languages.
The file will be named uniquely based on a generated timestamp.
The download action will be triggered for the file.
The mouseup event listener will be added to the document rather than the canvas, so a stroke also ends when the button is released outside the drawing area.
A spreadsheet is available for testing the system on different devices and reporting issues.
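
The save steps above can be sketched as follows. This is a minimal sketch, assuming the data is embedded in a data URL; the helper name buildDownload, the ".json" extension, and the field names in the usage note are hypothetical. The browser-only part is guarded so the file also loads outside a browser.

```javascript
// Pure helper: builds the file name and the data URL for the download.
function buildDownload(data, timestamp = Date.now()) {
  const fileName = timestamp + ".json"; // unique name from a timestamp
  const href =
    "data:text/plain;charset=utf-8," +
    encodeURIComponent(JSON.stringify(data)); // data as a JSON string
  return { fileName, href };
}

// Browser-only: creates a temporary anchor element and triggers the download.
function save(data) {
  if (typeof document === "undefined") return null; // not in a browser
  const { fileName, href } = buildDownload(data);
  const element = document.createElement("a");
  element.setAttribute("href", href);
  element.setAttribute("download", fileName);
  element.style.display = "none";
  document.body.appendChild(element);
  element.click(); // trigger the download action
  document.body.removeChild(element);
  return fileName;
}
```

A call such as save({ student: "Ann", session: 1, drawings: {} }) would then download a file named after the current timestamp.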

01:02:22

Fixing module recognition, constants, progress indicator, viewer app.

draw.js is shared with Node scripts, but the browser does not define "module"; the export is therefore guarded by checking that typeof module is not "undefined".
Constants get the same treatment: they move to a new file called constants.js, which the main file requires.
A progress indicator is implemented in a new file called utils.js as a function called printProgress with count and max parameters.
Exporting the utils object and including it in the data set generation to display a progress indicator.
Storing data in a format readable by browsers by creating a samples JavaScript file in a JS objects directory.
Creating a viewer app in viewer.html with basic HTML structure, including meta tags, title, external style sheet, and JavaScript files.
Grouping samples by student ID using a groupBy function in utils.js and displaying them in a table format in viewer.html.
Creating a function called createRow in display.js to display rows with student names and their corresponding samples.
Implementing a blur effect for flagged users' drawings by adding a blur class in CSS and applying it to images of flagged users in display.js.
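
A shared utils.js along the lines described above might be sketched like this. The formatPercent helper, the exact progress text, and the student_id field name are assumptions; the export guard is what lets the same file load in both Node and the browser.

```javascript
const utils = {};

// Hypothetical helper: 0.375 becomes "37.50%".
utils.formatPercent = (n) => (n * 100).toFixed(2) + "%";

// Progress indicator: rewrites one terminal line, e.g. "3/8 (37.50%)".
utils.printProgress = (count, max) => {
  const line = count + "/" + max + " (" + utils.formatPercent(count / max) + ")";
  if (typeof process !== "undefined" && process.stdout) {
    process.stdout.write("\r" + line); // carriage return overwrites the line
  }
  return line;
};

// Groups an array of objects by the value stored under `key`.
utils.groupBy = (objArray, key) => {
  const groups = {};
  for (const obj of objArray) {
    const val = obj[key];
    if (groups[val] == null) {
      groups[val] = [];
    }
    groups[val].push(obj);
  }
  return groups;
};

// The browser has no `module`, so export only where it exists (Node).
if (typeof module !== "undefined") {
  module.exports = utils;
}
```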

01:25:48

"Drawing Comparison: Traditional vs. Quickdraw Data"

The speaker admires various drawings, including a Christmas tree and horses, noting the uniqueness of each.
Mention of detailed drawings like Jin-Zon-Ning and Zhong-Nanago, highlighting the time-consuming nature of their creation.
Comparison to Quickdraw data, noting the absence of time limits and the presence of an undo button for improved drawing quality.
Encouragement for viewers to share their page styling versions for potential showcase in a future video.
Introduction of functions in a file named features.js to extract features from samples, starting with a function to count paths.
Explanation of a function to count points from paths by flattening arrays and exporting for node usage.
Creation of a feature extractor script in a node directory, reading samples, looping through them, and extracting features like path and point counts.
Outputting feature data to a features file, with a mention of excluding extra information if necessary.
Update of a web application to display feature data using Google charts, with details on setting up options, loading packages, and generating data tables.
Implementation of color styling based on labels, with a mention of transparency challenges and a switch to a custom chart library for better control over styling and features.
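
The two feature functions and the extraction loop described in this section can be sketched as below. Paths are assumed to be arrays of [x, y] points, and the sample shape ({ label, paths }) and the extractFeatures name are assumptions made for illustration.

```javascript
const features = {};

// Number of separate strokes in a drawing.
features.getPathCount = (paths) => paths.length;

// Total number of points: flatten the paths one level, then count.
features.getPointCount = (paths) => paths.flat().length;

// Hypothetical extractor: maps each sample to a label plus a feature point.
function extractFeatures(samples) {
  return samples.map((sample) => ({
    label: sample.label,
    point: [
      features.getPathCount(sample.paths),
      features.getPointCount(sample.paths),
    ],
  }));
}

// Same export guard as in the shared utility files, so Node can require it.
if (typeof module !== "undefined") {
  module.exports = features;
}
```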

01:50:23

"Dynamic Chart Features with Color Emphasis"

Images for different objects are generated based on specified colors using graphics in utils.
Colors of shapes are determined by colors specified in utils.
The chart supports a callback function for item selection, implemented as handleClick in display.js.
handleClick adds an emphasize class to the selected item.
The emphasize class in the style sheet sets the background color to yellow.
After emphasizing, display.js scrolls the element into view with the "auto" behavior.
Emphasized items are tracked in an array so that the emphasize class can be removed from each of them before a new selection is highlighted.
Chart and content are separated with chart fixed on the side in style CSS.
Control panel added to toggle visibility of input container for sketchpad.
Dynamic point added to chart to show location of drawn features, moving as drawing progresses.
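
The selection handling described above might be sketched as follows. This version takes the DOM element directly, which is an assumption; the real callback may receive a sample and look its element up. The "center" block option is also an assumption.

```javascript
// Elements that currently carry the "emphasize" class.
let emphasized = [];

function handleClick(element) {
  // Clear the previous selection first.
  for (const el of emphasized) {
    el.classList.remove("emphasize");
  }
  emphasized = [];

  if (!element) return; // clicking empty space only clears the selection

  element.classList.add("emphasize");
  if (typeof element.scrollIntoView === "function") {
    element.scrollIntoView({ behavior: "auto", block: "center" });
  }
  emphasized.push(element);
}
```

The matching style rule would set a yellow background on elements with the emphasize class.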

02:14:35

Enhancing Point Visibility in Data Analysis

The goal is to make a specific point more visible on a faded background when refreshing the page.
The value of the point needs to be large for visibility when zooming in on a chart.
To ensure the point remains visible even when zoomed in and at extreme corners, adjustments are needed.
Toggling input visibility should hide the point above the dynamic input section.
Implementing a triggerUpdate method in the sketchpad script helps update the point's visibility.
Calculating the width and height of drawings is crucial for structuring data.
Adjusting feature functions and the feature extractor to include width and height calculations.
Ensuring the dataset features are regenerated to reflect the new width and height values.
Addressing outliers in data, like drawings exceeding canvas boundaries, is common and requires attention.
Classifying drawings based on extracted features and finding the nearest sample for classification are key steps in the process.
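
As a rough illustration of the last few steps, the sketch below computes the width and height of a drawing and finds the nearest sample by Euclidean distance. The function names and the [width, height] feature order are assumptions.

```javascript
// Width of a drawing: horizontal extent over all points of all paths.
function getWidth(paths) {
  const xs = paths.flat().map((p) => p[0]);
  if (xs.length === 0) return 0;
  return Math.max(...xs) - Math.min(...xs);
}

// Height of a drawing: vertical extent over all points of all paths.
function getHeight(paths) {
  const ys = paths.flat().map((p) => p[1]);
  if (ys.length === 0) return 0;
  return Math.max(...ys) - Math.min(...ys);
}

// Index of the point in `points` closest to `loc`.
function getNearest(loc, points) {
  let minDist = Infinity;
  let nearestIndex = -1;
  for (let i = 0; i < points.length; i++) {
    const dist = Math.hypot(points[i][0] - loc[0], points[i][1] - loc[1]);
    if (dist < minDist) {
      minDist = dist;
      nearestIndex = i;
    }
  }
  return nearestIndex;
}
```

A new drawing would then take the label of the sample at the returned index. A drawing that runs past the canvas produces unusually large width or height values, which is how outliers of the kind mentioned above show up.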

02:38:29

Rectify chart distortion with proper data display.

The chart's misleading appearance is due to a square aspect ratio, causing data distortion.
To rectify this, calculate the delta on x and y to determine the maximum value for proper chart display.
Adjust the chart to eliminate squishing or stretching, ensuring accurate data representation.
Feature functions like path count and point count impact data classification differently.
Normalize data to ensure equal significance of all features for fair classification.
Data scaling techniques like normalization remap values between zero and one for uniform importance.
Outlier points can disrupt normalization, prompting consideration of standardization for better results.
Implement the k nearest neighbors classifier to determine classification based on majority votes.
Modify the nearest neighbor search to return multiple neighbors for improved classification accuracy.
Count and identify the majority label among the nearest samples for accurate classification.
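
Normalization and the majority vote can be sketched together as below. This is a minimal sketch under assumptions: samples are { label, point } objects, ties go to the label seen first among the nearest neighbors, and all function names are hypothetical.

```javascript
// Per-feature minimum and maximum over a list of points.
function getMinMax(points) {
  const min = [...points[0]];
  const max = [...points[0]];
  for (const p of points) {
    p.forEach((v, j) => {
      min[j] = Math.min(min[j], v);
      max[j] = Math.max(max[j], v);
    });
  }
  return { min, max };
}

// Remap every feature to the range 0..1 so all features weigh equally.
function normalizePoints(points, { min, max }) {
  return points.map((p) =>
    p.map((v, j) => (max[j] === min[j] ? 0 : (v - min[j]) / (max[j] - min[j])))
  );
}

// k nearest neighbors: take the k closest samples, return the majority label.
function classify(point, samples, k) {
  const nearest = samples
    .map((s) => ({
      label: s.label,
      dist: Math.hypot(...s.point.map((v, j) => v - point[j])),
    }))
    .sort((a, b) => a.dist - b.dist)
    .slice(0, k);

  const counts = {};
  for (const n of nearest) {
    counts[n.label] = (counts[n.label] || 0) + 1;
  }
  return Object.keys(counts).reduce((best, label) =>
    counts[label] > counts[best] ? label : best
  );
}
```

Because min-max scaling depends on the extreme values, a single outlier compresses every other point into a narrow band, which is the motivation for considering standardization instead.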

03:02:35

"Optimizing K-nearest neighbor classifier accuracy"

The process involves passing a sample point through a for loop and saving the file before refreshing the page to draw and analyze lines.
Despite the proximity of other objects, the system selects clocks due to their higher quantity.
Variants of nearest neighbor classifiers, like distance-weighted versions, exist, and implementing them is encouraged.
To evaluate the classifier, the data is split into training and testing sets, and accuracy is computed as the share of testing samples classified correctly.
The training set comprises 50% of the samples, and the testing set the remainder.
The training and testing sets are written into files, ensuring no overlap between the two for accurate testing.
Normalization matters here: the testing data is normalized with the minimum and maximum values taken from the training set, so both sets share one scale.
The classification process involves classifying all unknown points at once, with labels determined based on the training data.
The accuracy of the K-nearest neighbor classifier is calculated, with different values of K tested for optimal results.
Decision boundaries are introduced as a more informative method than visual representations like the spider, aiding in understanding the classifier's decisions.
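
The evaluation procedure in this section can be sketched as below. Assumptions: the split simply takes the first half of the samples as the training set (the summary does not say whether the split is ordered or random), a plain nearest neighbor lookup stands in for the full KNN classifier, and all names are hypothetical.

```javascript
// First half for training, the rest for testing, so the sets never overlap.
function split(samples, trainFraction = 0.5) {
  const cut = Math.floor(samples.length * trainFraction);
  return { training: samples.slice(0, cut), testing: samples.slice(cut) };
}

function getMinMax(samples) {
  const min = [...samples[0].point];
  const max = [...samples[0].point];
  for (const { point } of samples) {
    point.forEach((v, j) => {
      min[j] = Math.min(min[j], v);
      max[j] = Math.max(max[j], v);
    });
  }
  return { min, max };
}

function normalize(samples, { min, max }) {
  return samples.map((s) => ({
    label: s.label,
    point: s.point.map((v, j) =>
      max[j] === min[j] ? 0 : (v - min[j]) / (max[j] - min[j])
    ),
  }));
}

function nearestLabel(point, training) {
  let best = null;
  let bestDist = Infinity;
  for (const s of training) {
    const d = Math.hypot(...s.point.map((v, j) => v - point[j]));
    if (d < bestDist) {
      bestDist = d;
      best = s.label;
    }
  }
  return best;
}

function evaluate(samples) {
  const { training, testing } = split(samples);
  const minMax = getMinMax(training); // taken from the training set only
  const train = normalize(training, minMax);
  const test = normalize(testing, minMax); // same scale as the training set
  let correct = 0;
  for (const s of test) {
    if (nearestLabel(s.point, train) === s.label) correct++;
  }
  return { correct, total: test.length, accuracy: correct / test.length };
}
```

Running evaluate with different classifiers or values of K and comparing the accuracy field is one way to pick a setting.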

03:29:01

Implementing KNN Classifier for Image Classification

KNN classifier is included in the code for classification purposes.
The classifier is instantiated using training samples and a specific value for K.
The classify function is removed from the sketchpad, streamlining the code.
Evaluation is run after printing accuracy, followed by generating a decision boundary plot.
A canvas of 100x100 is created using the create canvas function.
The plot is pixel-based, with each pixel mapped to a point whose feature values lie between 0 and 1.
Normalized points are created for each pixel to color based on predicted values.
The decision boundary plot is saved as a PNG image in the data set directory.
The image is set as the background of the chart in the viewer HTML file.
JavaScript is used to prepare data for Python by converting samples to CSV format.
The feature extractor JavaScript file is modified to output training and testing data in CSV form.
Python implementation of K Nearest Neighbors (KNN) is done using libraries.
Training and testing data are read from CSV files and parsed into usable formats.
scikit-learn's KNeighborsClassifier is instantiated with 50 neighbors, the brute-force algorithm, and uniform weights.
The accuracy of the model is calculated and printed in the terminal.
A function to read feature files is created for reusability.
Suggestions are made to explore further with matplotlib for displaying feature values and decision boundaries.
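
The JavaScript side of this section, the decision boundary plot and the CSV export, can be sketched without a canvas as below. Assumptions: the grid holds one predicted label per pixel instead of a color, the y axis is flipped so the top row corresponds to a feature value near 1, and the function names and CSV column order are hypothetical.

```javascript
// One predicted label per pixel of a size x size plot.
function decisionGrid(size, classifyFn) {
  const grid = [];
  for (let y = 0; y < size; y++) {
    const row = [];
    for (let x = 0; x < size; x++) {
      // Map the pixel to a normalized point; flip y so up means larger.
      const point = [x / size, 1 - y / size];
      row.push(classifyFn(point));
    }
    grid.push(row);
  }
  return grid;
}

// Feature values followed by the label, one sample per line.
function toCSV(headers, samples) {
  let str = headers.join(",") + "\n";
  for (const sample of samples) {
    str += [...sample.point, sample.label].join(",") + "\n";
  }
  return str;
}
```

Coloring each pixel by its label and saving the result as a PNG gives the background image for the chart, while the CSV text is what the Python script reads back in.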
